BANANA PROTEINS, DNA, AND DNA REGULATORY ELEMENTS
ASSOCIATED WITH FRUIT DEVELOPMENT 

BACKGROUND OF THE INVENTION 

Field of the Invention 

The present invention relates to genes which are differentially expressed
during banana fruit development, the protein products of these genes, and DNA
regulatory elements which are differentially expressed during banana fruit
development.



Description of the Related Art 

Bananas are a crop of great importance both to the world economy and as a
means of subsistence for a large portion of the world's population. The global
banana export market is about 10% of the world's production, with a value of
$4 billion. Banana fruit are the fourth most important food in the developing
world (May, GD et al. (1995) Biotechnology 13:486-492), with approximately 100
million people acquiring their main energy source from bananas. Bananas, like
kiwifruit, papayas, and apples, are climacteric fruit, meaning they ripen in
association with an ethylene signal. In the ripening process, starch
degradation is associated with a respiratory climacteric in the fruit. Banana
fruit ripening is characterized by a number of biochemical and physiological
changes including fruit softening, changes in peel color and an increase in
respiratory activity (Seymour, GB (1993) in: Seymour GB, et al. (eds)
Biochemistry of Fruit Ripening, pp 83-106. Chapman & Hall, London). Although
ethylene is produced by the fruit, ripening can also be stimulated by the
application of exogenous ethylene. Alternatively, endogenous ethylene
production may be stimulated, e.g., by exposing fruit to acetylene.

More specifically, the post-harvest physiology of the banana (Musa acuminata
cv. Grand Nain) is characterized by initial harvest and a green storage phase,
followed by a burst in ethylene production that signals the beginning of the
climacteric period. Associated with this respiratory climacteric is a massive
conversion of starch to sugars in the pulp, during which the activities of
enzymes involved in starch biosynthesis decrease while those involved in starch
breakdown and mobilization increase rapidly (Wu et al. (1989) Acta Phytophysiol.
Sin. 15:145-152; Agravante et al. (1990) J. Jpn. Soc. Food Sci. Technol.
37:911-915; Iyare et al. (1992) J. Sci. Food Agric. 58:173-176; Cordenunsi et
al. (1995) J. Agric. Food Chem. 43:347-351; Hill et al. (1995) Planta
196:335-343 and 197:313-323). In addition, the rate of respiration rises
sharply (Beaudry et al. (1987) Plant Physiol. 8:277-282; Beaudry et al. (1989)
Plant Physiol. 91:1436-1444).

Other changes that occur during ripening include: fruit softening as a result
of enzymatic degradation of structural carbohydrates (Agravante et al. (1991)
J. Jpn. Soc. Food Sci. Technol. 38:527-532; Kojima et al. (1994) Physiol.
Plant. 90:772-778); a decline, catalyzed by polyphenol oxidase and peroxidases,
in the polyphenol compounds responsible for the astringency of the green unripe
fruit (Mendoza et al. (1994) in I Uritani et al., eds., Postharvest
Biochemistry of Plant Food-Materials in the Tropics. Japan Scientific Societies
Press, Tokyo, pp 177-191); an increase in the activity of alcohol
acetyltransferase, the enzyme that catalyzes the synthesis of isoamyl acetate,
the major aroma compound of banana fruit (Harada et al. (1985) Plant Cell
Physiol. 26:1067-1074); and a de-greening of the peel as a result of
chlorophyll breakdown by chlorophyllase (Thomas et al. (1992) Int. J. Food Sci.
Technol. 27:57-63). Stages of banana fruit ripening are scored by peel color
index (PCI) numbers, on a scale from 1 (very green) to 7 (yellow flecked with
brown) (Color Preferences Chart, Customer Services Department, Chiquita Brands,
Inc.). PCI can be correlated with other biochemical and physiological
parameters associated with fruit development and ripening, such as ethylene
biosynthesis and respiratory rate. Ethylene biosynthesis and respiratory rate
usually peak at PCI 2 and PCI 4, respectively, in ethylene-treated bananas
(Agravante et al. (1991) supra).

Associated with the respiratory climacteric is a large increase in the rate of
protein synthesis (Mugugaiyan (1993) Geobios 20:18-21), as well as differential
protein accumulation (Dominguez-Puigjaner et al. (1992) Plant Physiol.
98:157-162). Polygalacturonase (PG) has been identified as a protein that
increases in banana pulp during ripening, as determined by 2-D gel
electrophoresis and immuno-hybridization (id.). Many of the changes that occur
during ripening require de novo protein synthesis (Areas et al. (1988) J. Food
Biochem. 12:51-60); therefore, a secondary approach to investigate changes that
occur during ripening is to isolate transcripts encoding proteins associated
with the ripening process. Analogous studies of differential gene expression
have been successfully employed in other plant species.

Other enzymes associated with the development and ripening of fruit include
proteinase inhibitors and chitinases (Dopico et al. (1993) Plant Molec. Biol.
21:437), stress-related enzymes (Ledger et al. (1994) Plant Molec. Biol.
25:877), β-oxidation pathway enzymes (Bojorquez et al. (1995) Plant Molec.
Biol. 28:811), and metabolite-detoxifying enzymes (Picton et al. (1993) Plant
Molec. Biol. 23:193). Chitinases are abundant proteins found in a wide variety
of plants. Although chitinases are produced by a diversity of plant species,
the presence of chitin has not been reported in higher plants. Since chitin is
the major structural component of fungal cell walls, it has been proposed that
chitinases serve as defense proteins with antifungal activity. Chitinases are
reported to be induced in higher plants by a number of different types of
stress (Linthorst (1991) Crit. Rev. Plant Sci. 10:123; Punja et al. (1993) J.
Nematol. 25:526; Collinge et al. (1993) Plant J. 3:31). Many plant chitinases
are expressed constitutively, although at a low level.

As noted above, in ripening climacteric fruit, starch degradation is
associated with a respiratory climacteric in the fruit. Reactive oxygen species
(ROS) are byproducts of cellular respiration, especially under conditions which
result in high levels of NADH. ROS generation during respiration may occur at
the site of ubiquinones in the electron transport chain. Both yeast and
mammalian metallothioneins (MT) may play a direct role in the cellular defense
against oxidative stress by functioning as antioxidants (Dalton et al. (1994)
Nucl. Acids Res. 22:5016-5203; Tamai et al. (1993) Proc. Natl. Acad. Sci. (USA)
90:8013-8017; Bauman et al. (1991) Toxicol. Appl. Pharmacol. 110:347-354). MT
may play an additional role in supplying metal ions to Cu- and Zn-superoxide
dismutase (SOD), an enzyme that catalyzes the disproportionation of superoxide
anion to hydrogen peroxide and dioxygen and is thought to play an important
role in protecting cells from oxygen toxicity.

Transcripts encoding MT or MT-like proteins have been isolated from many
different plants (recently reviewed in Robinson et al. (1993) Biochem. J.
295:1-10). There is accumulating evidence that the plant MT mRNAs are
translated, and that the protein may have a function in the plant tissues from
which transcripts have been isolated. A seed-associated polypeptide (Ec
protein) has been purified from wheat and sequenced (Kawashima et al. 1992),
and more recently, MT was reported to have been isolated from Arabidopsis
(meeting abstract). Based on deduced amino acid sequences, plant MT proteins
are approximately 70 aa long and have characteristic cysteine-rich regions at
the N and C termini, separated by a variable spacer region. Based on the number
and distribution of the cysteine residues, plant MTs have been classified into
two distinct types (Robinson et al. (1993), supra). Type 1 MTs have 6
N-terminal and 6 C-terminal cysteine residues, whereas type 2 MTs have 8
cysteine residues in the N-terminal domain and 6 at the C-terminus. Although
there are no strict patterns of MT expression, in general type 1 transcript
abundance is high in roots, and is often metal-inducible, whereas type 2 is
expressed primarily in leaves. Other transcripts have been isolated that encode
proteins with homology to plant MTs but cannot be classified as either type 1
or type 2, and these include seed-specific proteins or transcripts from barley
and wheat (see Robinson et al. (1993), supra). In Arabidopsis thaliana, MT
proteins are encoded by a gene family containing five members, two encoding a
type 2 MT and three encoding an MT with homology to type 1 (Zhou et al. (1995)
Mol. Gen. Genet. 248:318-328).
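
The cysteine-count rule described above can be illustrated with a short sketch. It is not part of the original disclosure: the function name `classify_plant_mt`, the assumed domain length of 20 residues, and the toy sequences (synthetic strings, not real MT sequences) are hypothetical and serve only to show how the 6/6 versus 8/6 rule could be applied to a deduced amino acid sequence.

```python
# Hypothetical sketch: classify a plant MT-like protein as type 1 or type 2
# from the number of cysteines in its N- and C-terminal domains.
# ASSUMPTION: each terminal domain is taken to be the first/last
# `domain_len` residues; real domain boundaries vary between proteins.

def classify_plant_mt(seq: str, domain_len: int = 20) -> str:
    """Return 'type 1', 'type 2', or 'unclassified' for a one-letter sequence."""
    seq = seq.upper()
    n_cys = seq[:domain_len].count("C")   # cysteines in N-terminal domain
    c_cys = seq[-domain_len:].count("C")  # cysteines in C-terminal domain
    if (n_cys, c_cys) == (6, 6):
        return "type 1"
    if (n_cys, c_cys) == (8, 6):
        return "type 2"
    return "unclassified"

# Synthetic toy sequences (NOT real proteins): 20-residue terminal domains
# separated by a 30-residue cysteine-free spacer.
SPACER = "A" * 30
C_DOMAIN = "KCGSGCNCGDSCKCNPGTCK"            # 6 cysteines
TOY_TYPE1 = "MSGCNCGSSCKCGDSCSCGK" + SPACER + C_DOMAIN  # 6 N-terminal cysteines
TOY_TYPE2 = "MSCCGGNCGCGSGCKCGCGC" + SPACER + C_DOMAIN  # 8 N-terminal cysteines

print(classify_plant_mt(TOY_TYPE1))
print(classify_plant_mt(TOY_TYPE2))
```

Sequences that fit neither pattern, such as the seed-specific proteins mentioned above, would fall into the "unclassified" branch of this sketch.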
In plants, transcripts encoding metallothionein-like proteins have often been
isolated by differential screening. Type 2 MT transcripts expressed in
association with senescence, leaf abscission (Coupe et al. (1995) Planta
197:442-447), and fruit ripening (Ledger et al. (1994) Plant Molec. Biol.
25:877-886) have recently been isolated from plants. Using differential
screening, Ledger and Gardner (id.) found transcripts encoding MT-like proteins
in developing kiwifruit. One, pKIWI503, was specifically upregulated late in
fruit development, during ripening of the mature fruit.

A major component of the export market is the level of ripening control
which is exerted by modern banana shipping systems. Bananas for export must
be shipped under refrigeration at 12-14°C, often under controlled atmosphere
(CA) conditions (i.e., low oxygen combined with CO2), which reduces the effects
of ethylene produced by the fruit. Exposure to ethylene for 24 hours at
concentrations of 100-1000 µl per liter is used to trigger the ripening
climacteric. This "gassing" step is typically done near the final point in the
distribution system. Although this system is entirely functional, resulting in
marketability of high quality fruit with minimal losses, there remains a role
for engineered ethylene control in the banana export market. Bananas for export
are harvested green at approximately 75% of full size. This is done to ensure,
even with the use of low temperature and CA, that few if any of the bananas
start ripening during shipment. Allowing the bananas to remain on the plant
longer would result in more carbohydrate accumulation in the fruit and a
direct, zero cost increase in yield. If engineered ethylene control were
implemented in banana, this increased yield would come at no increased risk of
premature ripening during shipment.

Moreover, linking exogenous genes to isolated gene promoters that are
differentially expressed during banana ripening, and in response to ethylene,
would allow for the production of exogenous protein in banana tied to the
ripening process, and in other plants, controlled by ripening or exposure to
ethylene.
SUMMARY OF THE INVENTION

Accordingly, a major object of the present invention is to provide isolated
and purified genes which are differentially expressed during banana fruit
development, and to provide the protein products of these genes.

A further object of the present invention is to provide DNA regulatory
elements which are differentially expressed during banana fruit development, and
chimeric genes comprising these DNA regulatory elements operably linked to
heterologous DNA molecules, and plants transformed with said chimeric genes,
providing for controlled expression of said heterologous DNA molecules during
the development of the fruit of said plants, or in response to exogenous
development signals, such as ethylene signals in said plants.

A still further object of the present invention is to provide a method for
expression of a heterologous protein in fruit comprising transforming fruiting
plants with one or more chimeric genes according to the present invention,
exposing said fruit to the appropriate natural or exogenous development signal,
such as an ethylene signal, and harvesting fruit containing said heterologous
protein. The method of the present invention may further comprise isolating the
proteins produced by said method from the harvested fruit. In a particularly
preferred embodiment, the heterologous protein is a therapeutic protein, which
may be isolated from the harvested fruit, or consumed directly in the
transformed fruit by a patient in need of said therapeutic protein.

With the foregoing and other objects, advantages and features of the
invention that will become hereinafter apparent, the nature of the invention may
be more clearly understood by reference to the following detailed description of
the preferred embodiments of the invention and to the appended claims.



BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1. Relative abundance of ripening-associated transcripts in banana pulp
at PCI 1, 3 and 5. Plasmids containing the indicated cDNA were affixed to
nylon membrane and hybridized with pulp radio-labeled first-strand
cDNAs. Relative transcript abundance is expressed in arbitrary units
(AU).

Figure 2. Northern analyses of total RNA from pulp and peel (at PCI 3), root,
corm, and leaf tissues hybridized with cDNA probes representing each of
the eleven classes of differentially expressed transcripts. Putative identities
of each transcript are indicated to the left of the panel.

Figure 3. Total banana pulp protein extract at different stages of ripening,
separated by SDS-PAGE and stained with Coomassie blue. Protein
profiles during ripening show the presence of an abundant protein of 31
kDa that decreases in relative abundance during ripening.

Figure 4. Western blot analysis of total soluble protein extracted from
different banana tissues and hybridized with polyclonal antiserum against
purified P31. The antiserum detects a 31 kDa protein in pulp which is not
present in peel, meristem, leaf, corm, or root tissue.

Figure 5. Expression of P31 (top panel) and pBAN3-30 (bottom panel) in banana
pulp during ripening. Total protein and RNA were isolated from banana
pulp at each of seven stages of banana fruit ripening (PCI 1 through 7,
numbered at top of figure). Pulp proteins were separated by SDS-PAGE
and hybridized with the P31 antiserum. Total RNA (10 µg per lane) was
separated by agarose gel electrophoresis, transferred to nylon
membrane, and hybridized with a 32P-labeled banana chitinase cDNA probe
(pBAN3-30). Both the P31 protein and the corresponding chitinase
transcript at 1.2 kilobases are abundant in pulp during the early stages of
ripening but decrease as ripening progresses.

Figure 6. Western blot analysis of the translation products of four banana
chitinase cDNA clones homologous to pBAN3-30 expressed as fusion proteins with
β-galactosidase in pBluescript and hybridized with P31 antiserum. The
polyclonal antiserum recognizes a 35 kDa polypeptide in bacterial cultures
containing in-frame cDNA inserts (pBAN3-36 and pBAN3-45) that is not
present in bacterial cells containing either the pBluescript cloning vector
without an insert (no insert) or chitinase cDNA inserts that are not in-frame
with the β-galactosidase gene (pBAN3-30 and pBAN3-31).

Figure 7. Complete nucleotide sequence of the cDNA clone pBAN3-30 and
deduced amino acid sequence of the pBAN3-30 translation product. The
N-terminal amino acid sequence obtained from purified P31 is aligned with
the translation product and underlined, and is identical to the deduced
amino acid sequence of pBAN3-30 at 17 of 20 residues. The translation
initiation codon ATG starting at position 55 of pBAN3-30 is underlined, as
is the in-frame stop codon at position 1024. Other features of the
cDNA sequence include several putative polyadenylation signals between
positions 1136 and 1148 (underlined).

Figure 8. Amino acid alignments of A) amino- and B) carboxy-terminal regions of
banana P31 with class III acidic chitinase sequences from chickpea (Cicer
arietinum, 16), grape (Vitis vinifera, Busam et al. unpublished),
Arabidopsis thaliana (17), tobacco (Nicotiana tabacum, 18), and sugar beet
(Beta vulgaris, 19). Dots indicate the amino acid residues identical to the
banana P31 amino acid sequence on the top line. Dashes indicate gaps
introduced to aid the alignment. A) Amino-terminal alignment illustrates
the lack of sequence homology of the signal-peptide sequence of plant
chitinases. B) The carboxy-terminal region indicates the 18 residue
C-terminal extension unique to the banana P31 sequence.

Figure 9. cDNA sequences of MT F-1 and F-3.

Figure 10. A) Alignment of deduced amino acid sequences of banana, kiwifruit,
apple and papaya fruit-associated metallothionein-like proteins.
Alignment was performed using Clustal (default settings). Asterisks above
the sequence indicate the pattern of conserved cysteine residues. A dash
denotes a gap inserted in the sequence to aid in alignment. A dot indicates
that the amino acid in that position is identical to the banana F1 sequence on
the top line. (The total number of amino acids is indicated in parentheses at
the end of the sequence.) B) Phylogenetic tree of plant MT sequences indicating
that the fruit-associated MT are distinct from MT1 and MT2. GenBank Accession
numbers for sequences: banana F1; banana F3; kiwifruit (L27811);
papaya (EMBL Y08322); apple (U61974); white spruce (L47746); Vicia
faba MT1b (X91078); chickpea MT1 (Cicer arietinum) (X95708); P.
sativum MT (Z23097); Oryza sativa MT-2 (D89931); banana MT2; L.
esculentum MT-2 (Z68138); Arabidopsis thaliana MT2b (U11256);
Arabidopsis thaliana MT1b (U11254); Arabidopsis thaliana MT1a (U11253).

Figure 11. Northern blot analysis of MT transcript distribution in banana. Total
RNA (5 µg/lane) from different banana tissues was separated in a
formaldehyde-containing 2% agarose gel, transferred to nylon membrane,
and hybridized with an F1 or F3 cDNA probe. The large transcript
hybridizes more strongly to the F1 probe, and is approximately 540 bases.
The smaller transcript hybridizes more strongly to the F3 cDNA probe,
and is approximately 370 bases. Lane labels: Pu = pulp; Pe = peel; R =
root; C = corm; L = leaf.

Figure 12. Restriction maps of MT genomic clones. The maps represent the
coding region and at least 1 kb of flanking DNA. The approximate scale is
indicated by a dark bar.
Figure 13. Nucleotide sequence of MT F3 genomic clone, from the 5' HindII site
to the 3' SalI site. A 10-base 5' sequence motif beginning at -313 from the
translation start site (in capital letters) shares homology with an antioxidant
response element. The putative TATA box (starting at position -96 from
the translation start site) is underlined, and the three exons (beginning from
the translation start site) are depicted in capital letters. At the 3' end of
the sequence, the stop codon is underlined, as well as a potential
polyadenylation signal (TAAATAAA).

Figure 14. Relative MT transcript abundance in banana pulp-derived protoplasts
increases in the presence of hydrogen peroxide but not metal ions, as
compared to the untreated control. RNA dot-blots were hybridized to the
F3 cDNA probe and hybridization signal intensity, expressed in arbitrary
units (AU), was normalized to 18S rRNA as a measure of total RNA
loaded.

Figure 15A-E. Gluc. DNA and amino acid sequence.

Figure 16A-I. Endo. DNA and amino acid sequence.

Figure 17A-G. Chitinase DNA and amino acid sequence.

Figure 18A-C. MT/F1 DNA and amino acid sequence.

Figure 19A-C. F1/MT#2 DNA and amino acid sequence.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
OF THE INVENTION

The present invention provides isolated and purified banana proteins which
are differentially produced in banana fruit during ripening. In a preferred
embodiment, said proteins are selected from the group consisting of starch
synthases, granule-bound starch synthases, chitinases, endochitinases,
β-1,3-glucanases, thaumatin-like proteins, ascorbate peroxidases,
metallothioneins, lectins, and other senescence-related proteins.

The proteins of the present invention may be isolated from ripening fruit
using protein purification methods well known in the art. In particular, an
extract of fruit containing the protein of the present invention may be
subjected to chromatographic techniques which separate proteins present in the
extract according to size, affinity and charge. Fractions obtained from each
chromatographic step are analyzed for the desired enzymatic activity and
subjected to further purification steps. A particularly preferable method for
obtaining purified proteins according to the present invention is high
performance liquid chromatography (HPLC).

After a protein according to the present invention has been purified, its
amino acid sequence can be determined using amino acid sequencing methods well
known in the art. A particularly preferable method is Edman degradation.
Having obtained sequence information on the protein of the present invention,
one can design oligonucleotide probes for isolating the DNA encoding the protein
of the present invention, using conventional screening methods, or amplification
methods such as polymerase chain reaction (PCR). It is particularly preferable
to design such oligonucleotides in a completely degenerate manner, such that
oligonucleotides containing each codon encoding a particular amino acid are
present in the oligonucleotide mix. Alternatively, inosine can be used at
positions in the codon where degeneracies are known to be present. In a
particularly preferred embodiment, the proteins of the present invention are
encoded by a DNA molecule selected from the group consisting of clones pBAN
3-33, pBAN 3-18, pBAN 3-30, pBAN 3-24, pBAN 1-3, pBAN 3-28, pBAN 3-25, pBAN 3-6,
pBAN 3-23, pBAN 3-32, and pBAN 3-46.
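
The idea of a "completely degenerate" oligonucleotide mix can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used to obtain the clones above: the function names `degeneracy` and `degenerate_oligos` are hypothetical, the standard genetic code is assumed, and the example peptide is arbitrary.

```python
from itertools import product

# Standard genetic code in DNA letters, bases ordered T, C, A, G
# ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid to the list of codons that encode it.
CODONS_FOR = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO):
    CODONS_FOR.setdefault(aa, []).append(codon)

def degeneracy(peptide: str) -> int:
    """Number of distinct oligonucleotides needed to cover every codon choice."""
    n = 1
    for aa in peptide.upper():
        n *= len(CODONS_FOR[aa])
    return n

def degenerate_oligos(peptide: str) -> list:
    """Enumerate every DNA sequence that encodes the given peptide."""
    choices = [CODONS_FOR[aa] for aa in peptide.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Arbitrary example peptide (Met-Trp-Lys): only Lys has two codons.
print(degeneracy("MWK"))
print(degenerate_oligos("MWK"))
```

Because the size of the mix is the product of the codon counts, stretches rich in Met and Trp (one codon each) give the smallest mixes, while Leu, Ser and Arg (six codons each) increase it quickly.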

The present invention thus further provides an isolated and purified banana
DNA molecule which is differentially expressed in banana fruit during ripening.
More specifically, the present invention provides a DNA molecule which is
differentially expressed in banana fruit during ripening, wherein said DNA
molecule encodes a protein selected from the group consisting of a starch
synthase, a granule-bound starch synthase, a chitinase, an endochitinase, a
β-1,3-glucanase, a thaumatin-like protein, an ascorbate peroxidase, a
metallothionein, a lectin, and another senescence-related protein. In a
particularly preferred embodiment, these DNA molecules are the clones pBAN 3-33,
pBAN 3-18, pBAN 3-30, pBAN 3-24, pBAN 1-3, pBAN 3-28, pBAN 3-25, pBAN 3-6, pBAN
3-23, pBAN 3-32, and pBAN 3-46. In another preferred embodiment, the DNA
molecule of the present invention has a nucleotide sequence selected from the
group consisting of SEQ ID NO: 1; SEQ ID NO: 2; and SEQ ID NO: 3.

In general, the procedures for isolating the DNA encoding a protein
according to the present invention, subjecting it to partial digestion,
isolating DNA fragments, ligating the fragments into a cloning vector, and
transforming a host are well known in recombinant DNA technology. Accordingly,
one of ordinary skill in the art can use or adapt the detailed protocols for
such procedures as found in Sambrook et al. (1989), Molecular Cloning: A
Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor,
NY, 3 volumes, or in any other manual on recombinant DNA technology.

Once the gene encoding a protein of the present invention has been
obtained from one species, it can serve as a hybridization probe to isolate
corresponding genes from other species by cross-hybridization under low to
moderate stringency conditions. Such conditions are usually found empirically by
determining the conditions wherein the probe specifically cross-hybridizes to
its counterpart gene with a minimum of background hybridization. Nucleic acid
hybridization is a well known technique and is thoroughly detailed in Sambrook
et al.

As noted above, the DNA encoding the proteins of the present invention
can be originally isolated using PCR. Corresponding DNAs from other species
can also be isolated using PCR, and oligonucleotides for performing these
subsequent PCR reactions can be optimized using the sequence information
obtained from DNA cloned from the first species.

Moreover, peptides and fragments as well as chemically modified
derivatives of the proteins of the present invention are also contemplated by
the present invention. Briefly, any peptide fragment, derivative or analog which
retains substantially the same biological activity as the protein of the present
invention, and is differentially produced during fruit ripening, is
contemplated. An analog may be defined herein as a peptide or fragment which
exhibits the biological activity of the protein of the present invention, and
which is differentially expressed during fruit ripening, but which has an amino
acid substitution, insertion or deletion in comparison to the wild-type protein.
Such an analog can be prepared by the "conservative" substitution of an amino
acid having similar chemical properties. One of ordinary skill in the art can
readily identify suitable substitutions.

Thus, it should be appreciated that also within the scope of the present
invention are DNA sequences which encode a protein according to the present
invention having the same amino acid sequence as the wild-type protein, but
which are degenerate to the wild-type sequence. By "degenerate to" is meant
that a different three-letter codon is used to specify a particular amino acid.
It is well known in the art that the following codons can be used
interchangeably to code for each specific amino acid:



Amino Acid       Abbrev.       Codons

Phenylalanine    (Phe or F)    UUU, UUC
Leucine          (Leu or L)    UUA, UUG, CUU, CUC, CUA, CUG
Isoleucine       (Ile or I)    AUU, AUC, AUA
Methionine       (Met or M)    AUG
Valine           (Val or V)    GUU, GUC, GUA, GUG
Serine           (Ser or S)    UCU, UCC, UCA, UCG, AGU, AGC
Proline          (Pro or P)    CCU, CCC, CCA, CCG
Threonine        (Thr or T)    ACU, ACC, ACA, ACG
Alanine          (Ala or A)    GCU, GCC, GCA, GCG
Tyrosine         (Tyr or Y)    UAU, UAC
Histidine        (His or H)    CAU, CAC
Glutamine        (Gln or Q)    CAA, CAG
Asparagine       (Asn or N)    AAU, AAC
Lysine           (Lys or K)    AAA, AAG
Aspartic Acid    (Asp or D)    GAU, GAC
Glutamic Acid    (Glu or E)    GAA, GAG
Cysteine         (Cys or C)    UGU, UGC
Arginine         (Arg or R)    CGU, CGC, CGA, CGG, AGA, AGG
Glycine          (Gly or G)    GGU, GGC, GGA, GGG
Stop codon                     UAA (ochre), UAG (amber), UGA (opal)



It should be understood that the codons specified above are for RNA sequences. 
The corresponding codons for DNA have T substituted for U. 
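
As an illustration of what "degenerate to" means in practice, the following sketch converts RNA codons to DNA (T substituted for U) and checks whether two coding sequences specify the same amino acid sequence. The helper names `rna_to_dna`, `translate` and `encode_same_protein` are hypothetical, the short test sequences are arbitrary, and the standard genetic code is assumed (it also includes tryptophan, UGG, which is not listed in the table above).

```python
from itertools import product

# Standard genetic code in DNA letters, bases ordered T, C, A, G
# ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def rna_to_dna(seq: str) -> str:
    """Rewrite an RNA sequence in DNA letters (U becomes T)."""
    return seq.upper().replace("U", "T")

def translate(seq: str) -> str:
    """Translate complete codons of an RNA or DNA coding sequence."""
    dna = rna_to_dna(seq)
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, usable, 3))

def encode_same_protein(seq_a: str, seq_b: str) -> bool:
    """True if both sequences translate to the same amino acid sequence."""
    return translate(seq_a) == translate(seq_b)

# ATG-TTT-CTT and ATG-TTC-CTG differ at two positions but both give Met-Phe-Leu.
print(encode_same_protein("ATGTTTCTT", "ATGTTCCTG"))
# TTT (Phe) versus TTA (Leu) changes the encoded protein.
print(encode_same_protein("ATGTTT", "ATGTTA"))
```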

Mutations can be made in the wild-type sequence such that a particular
codon is changed to a codon which codes for a different amino acid. Such a
mutation is generally made by making the fewest nucleotide changes possible. A
substitution mutation of this sort can be made to change an amino acid in the
resulting protein in a non-conservative manner (i.e., by changing the codon from
an amino acid belonging to a grouping of amino acids having a particular size or
characteristic to an amino acid belonging to another grouping) or in a
conservative manner (i.e., by changing the codon from an amino acid belonging to
a grouping of amino acids having a particular size or characteristic to an amino
acid belonging to the same grouping). Such a conservative change generally leads
to less change in the structure and function of the resulting protein. A
non-conservative change is more likely to alter the structure, activity or
function of the resulting protein. The following is one example of various
groupings of amino acids:

Amino acids with nonpolar R groups
    Alanine          Proline
    Valine           Phenylalanine
    Leucine          Tryptophan
    Isoleucine       Methionine

Amino acids with uncharged polar R groups
    Glycine          Tyrosine
    Serine           Asparagine
    Threonine        Glutamine
    Cysteine

Amino acids with charged polar R groups (negatively charged at pH 6.0)
    Aspartic acid    Glutamic acid

Basic amino acids (positively charged at pH 6.0)
    Lysine           Arginine
    Histidine (at pH 6.0)

Another grouping may be according to molecular weight (i.e., size of R groups):

Glycine          75      Aspartic acid            133
Alanine          89      Glutamine                146
Serine          105      Lysine                   146
Proline         115      Glutamic acid            147
Valine          117      Methionine               149
Threonine       119      Histidine (at pH 6.0)    155
Cysteine        121      Phenylalanine            165
Leucine         131      Arginine                 174
Isoleucine      131      Tyrosine                 181
Asparagine      132      Tryptophan               204

Phenylalanine    Tryptophan    Tyrosine
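
The two ideas above, choosing the mutation that needs the fewest nucleotide changes and judging whether a substitution stays within a grouping, can be sketched as follows. This is an illustrative assumption-laden example rather than a prescribed method: the names `closest_codon`, `group_of` and `is_conservative` are hypothetical, the groupings are the R-group classes listed above written in one-letter code, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code in DNA letters, bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# R-group groupings from the text, in one-letter code.
GROUPS = {
    "nonpolar": set("AVLIPFWM"),
    "uncharged polar": set("GSTCYNQ"),
    "acidic": set("DE"),
    "basic": set("KRH"),
}

def group_of(aa: str) -> str:
    """Name of the grouping that contains the amino acid."""
    for name, members in GROUPS.items():
        if aa in members:
            return name
    raise ValueError(f"unknown amino acid: {aa!r}")

def is_conservative(old_aa: str, new_aa: str) -> bool:
    """A substitution is conservative if both residues share a grouping."""
    return group_of(old_aa) == group_of(new_aa)

def closest_codon(codon: str, target_aa: str):
    """Codon for target_aa that differs from `codon` at the fewest positions.

    Returns (codon, number_of_changes); ties go to the first codon found.
    """
    def changes(candidate):
        return sum(a != b for a, b in zip(codon, candidate))
    candidates = [c for c, aa in CODON_TO_AA.items() if aa == target_aa]
    best = min(candidates, key=changes)
    return best, changes(best)

# Lys (AAA) to Arg: a single change to AGA suffices, and the positive
# charge is retained, so the substitution is conservative.
print(closest_codon("AAA", "R"), is_conservative("K", "R"))
```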
Particularly preferred substitutions are:

- Lys for Arg and vice versa such that a positive charge may be maintained;
- Glu for Asp and vice versa such that a negative charge may be maintained;
- Ser for Thr such that a free -OH can be maintained; and
- Gln for Asn such that a free NH2 can be maintained.

Amino acid substitutions may also be introduced to substitute an amino
acid with a particularly preferable property. For example, a Cys may be
introduced at a potential site for disulfide bridging with another Cys. A His
may be introduced as a particularly "catalytic" site (i.e., His can act as an
acid or base and is the most common amino acid in biochemical catalysis). Pro
may be introduced because of its particularly planar structure, which induces
β-turns in the protein's structure.

Purification of the proteins of the present invention from natural or
recombinant sources can be accomplished by conventional purification means such
as ammonium sulfate precipitation, gel filtration chromatography, ion exchange
chromatography, adsorption chromatography, affinity chromatography,
chromatofocusing, HPLC, FPLC, and the like. Where appropriate, purification
steps can be done in batch or in columns.

Peptide fragments of the proteins of the present invention can be prepared
by proteolysis or by chemical degradation. Typical proteolytic enzymes are
trypsin, chymotrypsin, V8 protease, subtilisin and the like; the enzymes are
commercially available, and protocols for performing proteolytic digests are
well known. Peptide fragments are purified by conventional means, as described
above. Peptide fragments can often be identified by amino acid composition or
sequence. Peptide fragments are useful as immunogens to obtain antibodies
against the proteins of the present invention.
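
The fragments expected from a proteolytic digest can be predicted from sequence alone. The sketch below is a simplified, hypothetical illustration for trypsin only: it assumes the commonly used rule that trypsin cleaves on the C-terminal side of Lys or Arg unless the next residue is Pro, ignores missed cleavages, and uses an arbitrary example sequence; the function name `tryptic_digest` is not from the source.

```python
def tryptic_digest(protein: str) -> list:
    """Predict tryptic peptides from a one-letter amino acid sequence.

    ASSUMPTION: cleavage occurs after K or R, except when followed by P;
    missed cleavages and other exceptions are not modeled.
    """
    seq = protein.upper()
    peptides = []
    start = 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(seq[start:i + 1])  # fragment ends at the K/R
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal fragment without K/R
    return peptides

# Arbitrary example: the R-P bond is not cleaved under the assumed rule.
print(tryptic_digest("AKRPGKC"))
```

Comparable rules could be written for the other enzymes named above, but their specificities differ and would need to be stated separately.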
In accordance with the present invention, all or a part of a DNA molecule
according to the present invention can be stably inserted in a conventional
manner into the nuclear genome of a plant cell, and the so-transformed plant
cell can be used to produce a transgenic plant showing improved expression of
the DNA molecule according to the present invention. In this regard, a disarmed
Ti-plasmid, containing a DNA molecule according to the present invention, in
Agrobacterium (e.g., A. tumefaciens) can be used to transform a plant cell using
the procedures described, for example, in EP 116,718 and EP 270,822, PCT
publication WO 84/02913, EPA 87400544.0 and Gould et al. ((1991) Plant Physiol.
95:426), which are incorporated herein by reference. Preferred Ti-plasmid
vectors contain the foregoing DNA sequence between the border sequences, or at
least located to the left of the right border sequence, of the T-DNA of the
Ti-plasmid.

A DNA molecule according to the present invention may also be 
operatively linked to a promoter capable of regulating the expression of said DNA
molecule, to form a chimeric gene. Said chimeric gene may then be incorporated
into a replicable expression vector, as described below, for use in transforming
plants. The replicable expression vectors may also be used to obtain the
polypeptides of the present invention by well known methods in recombinant DNA
technology.

Replicable expression vectors according to the present invention comprise 
a nucleic acid encoding the subject gene, i.e., the coding sequence is operably 
linked in proper reading frame to a nucleotide sequence element which directs 
expression of a protein of the present invention. In particular, the nucleotide 
sequence elements may include a promoter, a transcription enhancer element, a
termination signal, a translation signal, or a combination of two or more of these 
elements, generally including at least a promoter element. 

Replicable expression vectors are generally DNA molecules engineered for 
controlled expression of a desired gene, especially where it is desirable to produce 
large quantities of a particular gene product, or polypeptide. The vectors
comprise one or more nucleotide sequences operably linked to a gene to control
expression of that gene, the gene being expressed, and an origin of replication
which is operable in the contemplated host. Preferably the vector encodes a
selectable marker, for example, antibiotic resistance. Replicable expression
vectors can be plasmids, bacteriophages, cosmids and viruses. Any expression
vector comprising RNA is also contemplated. The replicable expression vectors
of this invention can express the protein of the present invention at high levels.
Many of these vectors are based on pBR322, M13 and lambda and are well known
in the art and employ such promoters as trp, lac, PL, T7 polymerase and the like.
Hence, one skilled in the art has available many choices of replicable expression 
vectors, compatible hosts, and well-known methods for making and using the 
vectors. 

Other types of vectors can be used to transform plant cells, using 
procedures such as direct gene transfer (as described, for example, in EP
233,247), pollen mediated transformation (as described, for example, in EP
270,356, PCT publication WO 95/01856, and U.S. Patent No. 4,407,956),
liposome-mediated transformation (as described, for example, in U.S. Patent No.
4,536,475) and other methods such as the methods for transforming monocots
described in Fromm et al. ((1990) Bio/Technology 8:833) and Gordon-Kamm et
al. ((1990) Plant Cell 2:603).

Preferably, the gene according to the present invention is inserted in a 
plant genome downstream of, and under the control of, a promoter which can 
direct the expression of the gene in the plant cells. Preferred promoters include, 
but are not limited to, the strong constitutive 35S promoter (Odell et al. (1985)
Nature 313:810) of cauliflower mosaic virus; 35S promoters have been obtained
from different isolates (Hull et al. (1987) Virology 86:482). Other preferred
promoters include the TR1' promoter and the TR2' promoter (Velten et al. (1984)
EMBO J. 3:1113). Alternatively, a promoter can be utilized which is not
constitutive but rather is specific for one or more tissues or organs. For example,
a gene according to the present invention can be selectively expressed in the green
tissues of a plant by placing the gene under control of a light-inducible promoter
such as the promoter of the ribulose-1,5-bisphosphate carboxylase small subunit gene
as described in EPA 8300921.1. Another alternative is to use a promoter whose
expression is inducible by temperature or chemical factors.

It is also preferred that a gene according to the present invention be
inserted upstream of suitable 3' transcription regulation signals (i.e., transcript 3'
end formation and polyadenylation signals) such as the 3' untranslated end of the
octopine synthase gene (Gielen et al. (1984) EMBO J. 3:835-845) or T-DNA gene
7 (Velten and Schell (1985) Nucl. Acids Res. 13:6981-6998).

The resulting transformed plant of this invention expresses the inserted 
gene and is characterized by the production of high levels of the gene product. 
Such a plant can be used in a conventional breeding scheme to produce more
transformed plants with the same improved phenotypic characteristics, or to
introduce the gene into other varieties of the same or related plant species. Seeds,
which are obtained from transformed plants, contain the gene as a stable genomic 
insert. 

The present invention further encompasses compositions comprising one or 
more proteins according to the present invention, and a carrier therefor.

The present invention also provides isolated and purified banana DNA 
regulatory elements which are 5' or 3' to a gene which is differentially expressed
during banana fruit development. In a preferred embodiment, said DNA
regulatory elements are promoters. Said regulatory elements control the
expression of genes to which they are operatively linked, and are sensitive to a
plant development signal. In a preferred embodiment, the plant development
signal is an ethylene signal. The ethylene signal may be ethylene gas released
by ripening fruit, either naturally or stimulated artificially; alternatively, the
ethylene signal is produced by exposure of the plant or fruit to exogenous ethylene
gas.

The DNA regulatory elements of the present invention may be linked to 
native plant genes via homologous recombination, e.g., via the method of U.S.
Patent 5,272,071, the contents of which are incorporated herein by reference.
Alternatively, the DNA regulatory elements of the present invention may be
operatively linked to a DNA molecule which is desired to be expressed in a plant
in response to a development signal, thus forming a chimeric gene.
Transformation of plants with such a chimeric gene, as described above, provides
for controlled expression in fruit of the protein encoded by said DNA molecule. In a
particularly preferred embodiment, said DNA molecule encodes a therapeutic
protein.

The DNA molecules of the present invention may be used to transform any
plant in which expression of the particular protein encoded by said DNA
molecules is desired. In addition, the regulatory elements of the present invention
may be used to trigger gene expression in any plant in which gene expression is
desired. Suitable plants for transformation with the DNA molecules and
regulatory elements of the present invention include banana (e.g., Musa
acuminata); kiwifruit (e.g., Actinidia deliciosa); grape (e.g., Vitis vinifera, V.
labrusca, V. rotundifolia); peach, nectarine, plum, apricot, cherry, almond (e.g.,
Prunus persica, P. domestica, P. salicina, P. avium, P. cerasus, P. amygdalus);
pear (e.g., Pyrus communis, P. pyrifolia); apple (e.g., Malus x domestica);
eggplant (e.g., Solanum melongena); tomato (e.g., Lycopersicon lycopersicum, L.
esculentum); peppers (e.g., Capsicum sp.); peas and beans (e.g., Phaseolus
vulgaris, P. lunatus, P. limensis, Cicer arietinum, Vigna angularis, Pisum
sativum, Glycine max); cucumbers, melons, squash and pumpkins (e.g., Cucumis
melo, C. sativus, Citrullus lanatus, Cucurbita maxima, C. pepo); maize (e.g., Zea
mays); rice (e.g., Oryza sativa); wheat; barley (e.g., Hordeum vulgare); tobacco
(e.g., Nicotiana tabacum); potato (e.g., Solanum tuberosum); beet (e.g., Beta
vulgaris); carrot (e.g., Daucus carota); parsnip (e.g., Pastinaca sativa); turnip,
rutabaga (e.g., Brassica rapa, B. napus); and radish (e.g., Raphanus sativus). It
will be understood that this is not an exclusive list, but merely suggestive of the
wide range of utility of the DNA molecules and regulatory elements of the present
invention.

The present invention thus also provides a method for expression of 
heterologous protein in fruit comprising transforming fruiting plants with a 
chimeric gene, replicable expression vector, or plasmid comprising a
ripening-associated promoter, as described above, exposing said fruit to an
ethylene signal, and harvesting fruit containing said heterologous protein. The
protein may be isolated from the harvested fruit using conventional methods,
including those described above. Alternatively, where the protein is a therapeutic
protein, in a preferred embodiment the fruit may be directly consumed by a patient
in need of the therapeutic protein, thus providing for convenient oral
administration of the protein.

The following examples are presented in order to more fully illustrate the 
preferred embodiments of the invention. They should in no way be construed, 
however, as limiting the broad scope of the invention. 

EXAMPLE 1: Differential Gene Expression in
Ripening Banana (Musa acuminata cv. Grand Nain) Fruit

The experiments described in this example were designed to isolate those 
banana genes that are differentially expressed in ripening banana fruit. 

MATERIALS AND METHODS
Plant Materials 

Ethylene treated and untreated banana fruit (Musa acuminata cv. Grand 
Nain) were obtained from the Northside Banana Company (Houston, TX). The 
pulp and peel of fruit representing each of the seven different stages of ripening 
(PCI 1 through 7) were separated and quick-frozen in liquid nitrogen. Tissues
from ten individual fruit were pooled to obtain a uniform representative sample for
each ripening stage and ground to a fine powder under liquid nitrogen in a
stainless steel Waring blender. Ground samples were stored at -80 °C until
utilized. Leaf, corm and root tissue were obtained from greenhouse-grown plants
(cv Grand Nain), ground in liquid nitrogen using a mortar and pestle, and stored
at -80 °C.

RNA Isolation 

Pre-warmed (65 °C) RNA extraction buffer (1.4% (w/v) SDS, 2% (w/v)
polyvinylpyrrolidone, 0.5 M NaCl, 0.1 M sodium acetate, 0.05 M EDTA, pH
8.0, 0.1% (v/v) β-mercaptoethanol) was added to previously ground samples of
pulp from PCI 1 and PCI 3 at a 5:1 tissue to buffer ratio. Samples were
homogenized with two or three 30 second pulses of a Polytron tissue homogenizer
(Brinkman) and incubated at 65 °C for 15 min. Starch and other cell debris were
pelleted by centrifugation at 2,400g for 10 min at room temperature and the
supernatant transferred to a disposable 50 ml polypropylene screw-cap tube. After
the addition of 0.2 vol. of 5 M potassium acetate, pH 4.8, samples were mixed by
inversion and incubated on ice for 30 min. The resulting precipitate was pelleted
by centrifugation at 20.2k rpm for 10 min at 4 °C in a Sorvall SW28 rotor. The
supernatant was transferred to a disposable polypropylene centrifuge tube, and the
high-molecular weight RNA was precipitated by the addition of lithium chloride to
a final concentration of 2.5 M and incubation overnight at 4 °C.

RNA was isolated from leaf and root tissues using a CTAB isolation buffer
modified from Doyle and Doyle (1987). Root and leaf tissues were ground to a
powder in liquid nitrogen using a mortar and pestle. Five grams of frozen powder
were added to 10 ml of prewarmed (65 °C) CTAB RNA extraction buffer (100 mM
Tris-Borate, pH 8.2, 1.4 M NaCl, 20 mM EDTA, 1% (w/v) CTAB
(hexadecyltrimethylammonium bromide), 0.1% (v/v) β-mercaptoethanol). Samples
were homogenized with two or three 30 second pulses of a Polytron tissue
homogenizer (Brinkman), and the homogenate was incubated at 65 °C for one hour.
Samples were cooled to room temperature, extracted twice with an equal volume of
chloroform, and the phases were separated by centrifugation. Following
centrifugation, lithium chloride was added to a final concentration of 2 M, and
RNA was allowed to precipitate overnight at 4 °C. RNA was pelleted at 4 °C for
20 min at 20,000g, washed with 70% ethanol and re-suspended in DEPC-treated
H2O. The RNA was phenol:chloroform (1:1) extracted and ethanol precipitated.
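
As a worked illustration of the buffer recipes above, the following minimal sketch converts molar, % (w/v) and % (v/v) specifications into amounts for a chosen batch volume. The helper names, the 100 ml batch size and the NaCl formula weight of 58.44 g/mol are assumptions introduced here for illustration; they are not taken from the protocol.

```python
NACL_FORMULA_WEIGHT = 58.44  # g/mol, standard value (assumption, not from the text)


def grams_for_molar(molarity, formula_weight, volume_ml):
    """Grams of solute needed for a given molarity (mol/l) in volume_ml."""
    return molarity * formula_weight * volume_ml / 1000.0


def grams_for_percent_wv(percent, volume_ml):
    """Grams of solute for a % (w/v) solution: g per 100 ml."""
    return percent * volume_ml / 100.0


def microliters_for_percent_vv(percent, volume_ml):
    """Microliters of liquid component for a % (v/v) solution."""
    return percent * volume_ml / 100.0 * 1000.0


if __name__ == "__main__":
    batch_ml = 100  # hypothetical batch size
    # CTAB RNA extraction buffer components quoted in the text
    print("NaCl (1.4 M):", round(grams_for_molar(1.4, NACL_FORMULA_WEIGHT, batch_ml), 2), "g")
    print("CTAB (1% w/v):", grams_for_percent_wv(1.0, batch_ml), "g")
    print("beta-mercaptoethanol (0.1% v/v):", microliters_for_percent_vv(0.1, batch_ml), "ul")
```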

cDNA Library Construction 

Pulp PCI 1 and 3 cDNA libraries were generated using poly(A)+ mRNA
prepared from total RNA using a magnetic bead separation protocol (Dynal)
according to the manufacturer's instructions. Lambda Zap cDNA libraries were
generated according to the supplier's protocol (Stratagene).

Differential Screening

Approximately 5 x 10^4 plaque-forming units (pfu) from each cDNA library
were plated onto LB plates using the appropriate E. coli host strain. Duplicate
plaque-lifts were generated by placing Nytran nylon filters (Schleicher and
Schuell) onto plaque-containing plates for one and four minutes for the first and
second filters, respectively. Filter-bound DNA was denatured for two min in 1.5
M NaCl, 0.5 M NaOH, and neutralized for four minutes in 1.5 M NaCl, 0.5 M
Tris (pH 8.0). Filters were rinsed in 0.5 M Tris (pH 8.0), blotted dry, and UV
crosslinked (Stratalinker, Stratagene).

Labeled first-strand cDNA probes used in the differential screening were
synthesized from 15 mg total RNA in the presence of 1.5 µM [α-32P]dCTP (3000
mCi/mmol) using an oligo(dT)15 primer (Promega) and 15 U MMLV reverse
transcriptase according to the manufacturer's instructions (Promega). The mRNA
template was removed by hydrolysis in 100 mM NaOH at 65 °C for 30 min. The
reaction was neutralized in 100 mM Tris-HCl (pH 8.0), and the labeled first-strand
cDNA was ethanol precipitated in the presence of 20 µg of carrier yeast tRNA.

Filters were pre-hybridized for 30 min in 1 mM EDTA, 0.25 M phosphate 
buffer (pH 7.2), 7% (w/v) SDS, and hybridized overnight at 65°C in the same
solution containing the denatured probe (1 X 10'' cpm/ml). Hybridized filters 
were washed twice for 30 min each at 65 °C in Wash Solution One (1 mM EDTA,
40 mM phosphate buffer, pH 7.2, 5% (w/v) SDS) and three times for 30 min each
at 65 °C in Wash Solution Two (1 mM EDTA, 40 mM phosphate buffer, pH 7.2,
1% (w/v) SDS). The air-dried filters were subjected to autoradiography (X-Omat
X-ray film, Kodak) for 72 h at -80 °C with an intensifying screen.

Banana pulp cDNA libraries from PCI 1 and PCI 3 were each probed 
separately and differentially with labeled cDNA from pulp at PCI 1 and PCI 3. 
Plaques which demonstrated strong differential signal intensities between both
probes were selected as positives. Positive plaques were then subjected to
secondary screening to purify single isolates by utilizing the same probes as in the
primary screening. pBluescript phagemids were excised from the isolated plaques 
according to the manufacturer's recommendations (Stratagene). 
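
The selection step above relied on comparing signal intensities on duplicate autoradiographs. If the two signals for a plaque were quantified, the same decision could be sketched as follows. This is only an illustration: the function name, the three-fold threshold and the background floor are hypothetical and were not part of the screening described here.

```python
def classify_plaque(signal_pci1, signal_pci3, min_fold=3.0, floor=1.0):
    """Label a plaque from its two probe signals.

    min_fold and floor are hypothetical parameters: floor is added to both
    signals so that a zero reading does not cause division by zero.
    """
    ratio = (signal_pci3 + floor) / (signal_pci1 + floor)
    if ratio >= min_fold:
        return "up in PCI 3"
    if ratio <= 1.0 / min_fold:
        return "up in PCI 1"
    return "not differential"


if __name__ == "__main__":
    # Made-up intensity pairs (PCI 1 probe, PCI 3 probe) for illustration
    for pair in [(10, 100), (100, 10), (50, 60)]:
        print(pair, "->", classify_plaque(*pair))
```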

Sequence Analysis

Small-scale alkaline lysis plasmid preparations followed by 
phenol:chloroform extraction and ethanol precipitation (Sambrook et al., 1989)
yielded template plasmid DNA suitable for automated sequencing. Plasmid DNA
templates were sequenced, using the T3 primer, on an ABI 373A DNA sequencer
(Applied Biosystems, Foster City, CA). Vector and 3' poly(A) residue sequences
were removed from the output sequence. Edited sequences were loaded into the
NCBI form for BLAST (9.1) searching on a network server
(www.ncbi.nlm.nih.gov), and searches were performed using the default settings
of BLASTN (Altschul et al., 1990). For some cDNA clones, no significant
homology (defined as a high score above 100) with sequences in the databases was
identified using BLASTN. In that event, the default settings of the BLASTX
search, an algorithm that translates the nucleic acid sequence in all six frames and
searches a non-redundant amino acid database for matches, were used (Gish and
States, 1993).
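
The six-frame translation that BLASTX performs before searching the protein database can be illustrated with a short, self-contained sketch. It assumes the standard genetic code and an unambiguous A/C/G/T input; it is not the BLASTX implementation, and the function names are introduced here for illustration only.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the start of seq; unknown codons become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Return translations of the three forward and three reverse frames."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translations("ATGGCC").items():
        print(name, protein)
```

Each of the six translated strings would then be compared against the protein database; stop codons appear as `*` in the output.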

Dot-blot Hybridization 

Comparisons of the relative transcript abundance of the individual cDNA 
clones between PCI 1, 3 and 5 pulp were made through dot-blot hybridization
experiments. Plasmids containing the cDNA inserts were affixed to nylon
membrane and hybridized with first-strand cDNA generated from PCI 1, 3
or 5 pulp RNA. The equivalent of 1 x 10^11 copies of each plasmid (approximately
0.5 µg of plasmid DNA containing a 1 kb cDNA insert) was heat denatured (95 °C
for 10 min), and quenched on ice. Using a vacuum dot-blot apparatus (BioRad),
target DNA was applied to Hybond N+ nylon membrane (Amersham).
Membranes were air-dried, UV crosslinked (Stratalinker), and hybridized as 
described above using 2x10^ cpm/ml of PCI 1,3, and 5 radiolabeled first strand 
cDNA as probe. Following hybridization, membranes were exposed to a 
phosphorescent screen (PhosphorImager, Molecular Dynamics) and the scanned
image was analyzed with the ImageQuant quantitation software.
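
The relationship between plasmid copy number and DNA mass quoted for the dot blots can be checked with a short calculation. The sketch below assumes an average of 650 g/mol per base pair of double-stranded DNA, a vector backbone of about 3.0 kb (roughly the size of pBluescript), and a copy number of 10^11, the reading that best fits the quoted figure of approximately 0.5 µg; all three are assumptions rather than values confirmed by the text.

```python
AVOGADRO = 6.022e23          # molecules per mole
GRAMS_PER_MOL_PER_BP = 650.0  # approximate average for dsDNA (assumption)


def plasmid_mass_ug(copies, plasmid_bp):
    """Mass in micrograms of `copies` molecules of a plasmid of `plasmid_bp` base pairs."""
    grams = copies * plasmid_bp * GRAMS_PER_MOL_PER_BP / AVOGADRO
    return grams * 1e6


if __name__ == "__main__":
    vector_bp = 3000   # assumed backbone size
    insert_bp = 1000   # 1 kb cDNA insert, as in the text
    mass = plasmid_mass_ug(1e11, vector_bp + insert_bp)
    print(f"1e11 copies of a {vector_bp + insert_bp} bp plasmid = {mass:.2f} ug")
```

Under these assumptions the result is a little over 0.4 µg, in line with the "approximately 0.5 µg" stated for a plasmid carrying a 1 kb insert.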

Northern Analyses 

Total RNA was isolated from banana pulp and peel at PCI 3, and from 
root, corm, and leaf tissues of greenhouse-grown Grand Nain banana plants. Ten 
micrograms of each of the RNA samples were separated by electrophoresis
through formaldehyde-containing agarose gels and transferred to Nytran Plus
nylon membrane (Schleicher and Schuell) using a vacuum transfer apparatus
(BioRad) according to the manufacturer's recommendations. Equal RNA loading
was confirmed by staining the RNA-containing nylon membranes with methylene
blue (Sambrook et al., 1989). The RNA blots were hybridized with a cDNA probe
representing the largest isolate from each of the eleven nonredundant groups of
clones. DNA probes were synthesized using the Rad-Prime DNA Labeling
System (Gibco BRL), and hybridized as described above.

RESULTS 

Differential screening of approximately 10^ plaques with labeled pulp 
cDNAs resulted in the identification of approximately 100 plaques with a signal 
intensity sufficient to be detected by autoradiography after a 72 hour exposure to 
X-ray film. It was apparent from the signal intensities observed between 
differentially hybridized plaque lifts that the relative abundance of a number of 
transcripts changed between PCI 1 and 3. A total of 38 cDNA clones were
isolated from banana pulp libraries by differential screening. 

Sequence alignment and homology searches indicate that eleven non- 
redundant groups of cDNAs were identified (Table 1). Using sequence 
homology, BLAST searches were able to assign, with high scores between 167 
and 1294, a putative identity for all clones. Amino acid sequence homology 
searches using the BLASTX algorithm were necessary to assign an identity to the 
clones encoding the putative lectin and senescence-related protein. According to 
the results of the sequence homology searches, all of the banana sequences are 
more similar to other plant genes than to genes from other organisms. There were 
many redundant isolates, especially of the smaller cDNAs such as those encoding 
the different metallothioneins. Ten clones encoding a putative chitinase, an 
especially abundant protein in banana pulp (R. López-Gómez, unpublished data),
were isolated. 

Relative abundance among the different transcripts was estimated by 
hybridizing isotopically labeled first-strand cDNA to an excess of cloned cDNA 
which was previously dot-blotted onto nylon membrane. This technique also 
allowed for the confirmation of differential expression of these transcripts in pulp 
between PCI 1 and 3, and at a later stage of ripening, PCI 5 (Figure 1). Relative
transcript abundance of starch synthase, GBSS, chitinase, and a type 2
metallothionein decreased in pulp between PCI 1 and 3, and continued to
decrease through PCI 5. There was a peak in the abundance of several of the
transcripts in PCI 3 pulp, including endochitinase, glucanase, thaumatin, ascorbate
peroxidase, and metallothionein. The differential expression of these banana
transcripts before and after the peak in ethylene biosynthesis indicates that the
transcripts that increase in abundance between PCI 1 and PCI 3 respond to
ethylene. The differential expression of the eleven different groups of cDNAs in
banana pulp between ripening stages PCI 1 and 3 was confirmed by Northern
analyses (data not shown). Results from the dot-blot hybridization were also used
to estimate relative abundance of each class of cDNA in the pulp of ripening
banana fruit, with thaumatin and β-1,3-glucanase being the first and second most
abundant transcripts, respectively (Figure 1).

Although these cDNAs are relatively abundant in the pulp of banana fruit,
their patterns of expression are not limited to these tissues. Northern analyses
indicate that starch synthase, GBSS, and chitinase transcripts were abundant in
pulp and corm tissues, and present in peel. Expression of the endochitinase,
thaumatin-like protein, and β-1,3-glucanase transcripts was limited to the pulp and
peel of the fruit. Both classes of metallothionein transcripts were expressed in all
tissues analyzed, but were most abundant in the pulp and peel. In comparison,
MT was more abundant in leaves than Type-2 MT, while the converse was
observed in the corm. Lectin transcripts were most abundant in pulp and root
tissues, whereas the ascorbate peroxidase and senescence-related protein
transcripts were ubiquitously expressed.

Many of the physiological changes that occur during banana fruit ripening 
are in response to ethylene produced in the pulp (Domínguez and Vendrell,
1993; Burdon et al., 1994). In addition, ethylene also serves as a signal for other
physiological changes including senescence. The cDNA clones identified in this
study were isolated by differential screening at stages of fruit ripening
corresponding to periods before and after the peak in ethylene biosynthesis
(Agravante et al., 1991). Therefore, it is likely that some of the transcripts that
increase in abundance between those stages of ripening may be regulated by
ethylene, even if they do not have a direct role in the ripening process. The role
of ethylene in the regulation of PR proteins (glucanase, chitinase, endochitinase,
thaumatin) has been well documented. Ethylene is also believed to influence
expression of ascorbate peroxidase (Mehlhorn, 1990) and metallothionein (Coupe
et al., 1995).

EXAMPLE 2: The Abundant 31-Kilodalton Banana Pulp Protein is
Homologous to Class-III Acidic Chitinases

The experiments described in this example were designed to identify and 
characterize the abundant 31kD protein from the pulp of banana fruit (Musa 
acuminata cv. Grand Nain), and to isolate a cDNA encoding this protein.

MATERIALS AND METHODS 
Plant Materials 

Ethylene treated and untreated banana fruit (Musa acuminata cv. Grand 
Nain) were obtained from the Northside Banana Company (Houston, TX). The
pulp and peel of fruit representing each of the seven different stages of ripening
(PCI 1 through 7) were separated and quick-frozen in liquid nitrogen. Tissues
from ten individual fruit were pooled to obtain a uniform representative sample for
each ripening stage and ground to a fine powder under liquid nitrogen in a
stainless steel Waring blender. Ground samples were stored at -80 °C until
utilized. Other banana tissues were obtained from greenhouse-grown plants (cv
Grand Nain). 

Protein Isolation for Antiserum Production, N-terminal Sequencing, and Western
Blotting

Soluble banana pulp proteins were differentially precipitated from pulp 
extracts with ammonium sulfate. P31 was most abundant in the 40 to 60%
ammonium sulfate fraction, as determined by SDS-PAGE separation (Laemmli,
U.K. (1970) Nature 227:680), followed by Coomassie blue staining (Sambrook
et al. (1989) Molecular Cloning, a Laboratory Manual, Ed. 2 Cold Spring Harbor
Press, Plainview, NY). The 31 kDa protein band was excised from the gel,
homogenized and used to immunize a rabbit for antiserum production, according
to standard protocols. In addition, proteins from the 40 to 60% ammonium sulfate
fraction were separated by SDS-PAGE, transferred to PVDF protein sequencing
membrane, and stained with Coomassie blue. The stained 31 kDa protein band
was excised from the membrane and the N-terminal sequence was determined.

Total protein isolated from banana root, corm, leaf, meristem, peel, and 
pulp at different stages of ripening were separated by SDS-PAGE and 
electrophoretically transferred to PVDF membrane. The membranes were
incubated with the primary antiserum at 1:500 dilution, and the bound antibodies 
were visualized using chemiluminescence. 



Northern Blot Analyses 

Total RNA was isolated from banana leaf, corm, root, peel, and floral
structures and from banana pulp at PCI 1 through 7 (Lopez-Gomez, R., et al.
(1992) 5:440). Agarose gel electrophoresis, and northern blotting and
hybridization were performed according to standard protocols (Sambrook et al.,
supra). The cDNA clone pBAN3-30 was radiolabeled with 32P-dCTP by random
priming and used as a probe.



pBAN3-30 Isolation and Sequence Analysis 

pBAN3-30 was isolated from a banana pulp cDNA library by differential 
screening (Clendennen, S.K. et al. (1997) Plant Physiology). The complete
sequence of the cDNA insert was determined on both strands, and the open
reading frame was translated. Sequence homology of pBAN3-30 and the
translation product (P31) were determined using the BLAST search algorithm for
searching GenBank (Altschul, S.F., et al. (1990) J. Molec. Biol. 215:403). For
the amino acid alignments, plant chitinase sequences showing significant
homology to P31 were downloaded from GenBank and aligned manually.

Expression of Recombinant P31 

A total of ten homologous chitinase clones were isolated from the banana 
pulp cDNA library by differential screening, including pBAN3-30, pBAN3-31, 
pBAN3-36, and pBAN3-45 (Clendennen et al., supra). These four clones were
used for the expression of P31 for western blot analysis of the translation
products. It was determined that pBAN3-36 and pBAN3-45 contained chitinase
coding sequences that were in-frame with respect to β-galactosidase in the
pBluescript cloning vector. All four of the cDNA clones, in E. coli XL1-Blue
host cells, were grown to log phase in selective media and then induced by IPTG.
Total bacterial proteins were separated by SDS-PAGE and transferred to PVDF 
membrane. The western blot was hybridized with P31 antiserum and visualized 
using chemiluminescence. 
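
Whether a cDNA insert is in frame with the vector's β-galactosidase start codon reduces to a modulo-3 check on the distance between the two. The sketch below illustrates this with a hypothetical helper and made-up coordinates; the actual junction positions of the pBAN clones are not given in the text.

```python
def in_frame(vector_atg_pos, insert_first_codon_pos):
    """True when the insert's first codon starts a whole number of codons
    downstream of the first base of the vector's start codon."""
    offset = insert_first_codon_pos - vector_atg_pos
    if offset < 0:
        raise ValueError("insert must lie downstream of the vector start codon")
    return offset % 3 == 0


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only
    print(in_frame(0, 99))   # 33 codons downstream: in frame
    print(in_frame(0, 100))  # shifted by one base: out of frame
```

Only in-frame inserts are expected to yield a fusion protein carrying the chitinase epitopes recognized by the antiserum.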

RESULTS 

P31 Isolation and Tissue Distribution 

SDS-PAGE analysis of total proteins isolated from pulp of banana fruit at 
seven ripening stages indicated changes in abundance of several proteins (Figure
1). The most abundant protein during the pre-climacteric stage (Peel Color Index
or PCI 1 and 2) is a 31 kDa protein (P31) which seemed to decrease slightly in
abundance as ripening proceeded (Figure 3). This protein (P31) was partially
purified by a combination of ammonium sulfate precipitation and separation by
SDS-PAGE. Polyclonal antiserum was raised against the purified protein. The
P31 antiserum recognizes a single 31 kDa polypeptide in banana pulp that is not
present in banana peel, meristem, corm, or root tissue (Figure 4). These results 
indicate that P31 is fruit-specific. 

The N-terminus of the partially purified protein was sequenced and the 
resultant 20-amino acid sequence is: GRNSCIGVYWGQKTDEGSLA (data also
appear in Figure 7). A search of the amino acid sequence database (GenBank)
revealed that the N-terminus of P31 shares significant homology to amino-terminal
peptide sequences from purified acidic chitinases of Mongolian snake-gourd
(Trichosanthes kirilowii; see Savary et al. (1994) Plant Physiol. 106:1195) and
chick pea (Cicer arietinum; see Vogelsang, R., et al. (1993) Planta 189:60).

P31 Expression in Ripening Pulp

P31 expression in banana pulp during ripening was investigated at the
protein and transcript levels. Western blot analysis of banana pulp proteins
isolated at each of seven chronological stages of ripening (Figure 5, top panel)
indicates that P31 decreases in relative abundance during ripening, consistent with
the observations of P31 abundance after separation by SDS-PAGE and staining
with Coomassie blue. Using differential screening, several ripening-associated
genes were isolated from a banana pulp cDNA library, including clones with
significant homology to chitinases (Clendennen et al., supra). For determination
of relative chitinase transcript abundance during ripening, total RNA was isolated
from banana pulp during ripening, at PCI 1 through 7, and probed with labeled
pBAN3-30. Northern blot analysis (Figure 5, bottom panel) shows that the P31
message is strongly expressed during the first few ripening stages (PCI 1 through 3),
after which the chitinase transcript declines in banana pulp through the later stages
of ripening. This observation is consistent with the result obtained through
western analysis. Northern and western blot analysis together suggest that
expression of P31 is both fruit-specific and developmentally regulated in banana.
While both the P31 protein and the chitinase transcript are abundant during the
pre-climacteric stages of fruit ripening (PCI 1 through 3), their relative levels
decrease as ripening progresses.



pBAN3-30 Encodes P31

Three lines of evidence lead us to conclude that pBAN3-30 encodes the
abundant 31 kDa pulp protein. First, the expression pattern of the pBAN3-30
transcript during ripening matches very closely with the profile of P31 abundance
during ripening as determined by western blot analysis using the P31 antiserum, as
seen in Figure 5. Second, the P31 antiserum recognizes the translation product of
the chitinase cDNA insert. The translation products of the cDNA clones pBAN3-36
and pBAN3-45, which are homologous to pBAN3-30 but have been determined
to be in-frame with respect to the β-galactosidase gene in the pBluescript cloning
vector (Stratagene), were expressed as fusion proteins with β-galactosidase.
These fusion proteins were analyzed by western blotting and incubation with the
P31 antiserum. The P31 antiserum recognizes a 35 kDa polypeptide produced in
the IPTG-induced bacterial cells containing an in-frame chitinase cDNA (pBAN3-36
and pBAN3-45) that is not present in cell extracts from bacteria containing only
the pBluescript plasmid (no insert) or out-of-frame chitinase cDNA inserts
(pBAN3-30 and pBAN3-31) (Figure 6). Finally, the N-terminal amino acid
sequence obtained from the purified protein, which is underlined in Figure 7, is
identical to the deduced amino acid sequence of pBAN3-30 at 17 of 20 residues.
This match is improved when the first amino acid residue, which is usually
considered to be uncertain, is discounted. Despite the high sequence homology,
the amino acid sequence from the partially purified protein is not completely
identical to the amino acid sequence deduced from the cDNA clone pBAN3-30. It
is possible that a contaminating polypeptide co-migrated with P31 and influenced
the amino acid sequence results. Alternatively, it is possible that P31 is encoded
by a gene family in banana, members of which are highly homologous, though not
identical, and cannot be distinguished from one another by northern or western
analyses.
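
The residue-by-residue comparison described above (17 of 20 positions identical, with the match improving when the uncertain first residue is discounted) can be illustrated with a short sketch. The function names are hypothetical and the two 20-residue peptides are invented placeholders, not the P31 N-terminal sequence or the pBAN3-30 translation; only the counting logic is intended to carry over.

```python
def count_identities(observed: str, deduced: str, skip_first: bool = False):
    """Return (matches, positions compared) for two ungapped peptides of equal length."""
    if len(observed) != len(deduced):
        raise ValueError("sequences must have the same length")
    pairs = list(zip(observed.upper(), deduced.upper()))
    if skip_first:
        # Discount the first sequencing cycle, which is often unreliable.
        pairs = pairs[1:]
    matches = sum(1 for a, b in pairs if a == b)
    return matches, len(pairs)


def percent_identity(observed: str, deduced: str, skip_first: bool = False) -> float:
    """Percent of compared positions that are identical."""
    matches, compared = count_identities(observed, deduced, skip_first)
    return 100.0 * matches / compared


# Hypothetical peptides differing at positions 1, 8, and 16 (1-based).
DEDUCED = "ACDEFGHIKLMNPQRSTVWY"
OBSERVED = "GCDEFGHLKLMNPQRATVWY"

print(count_identities(OBSERVED, DEDUCED))                   # (17, 20)
print(count_identities(OBSERVED, DEDUCED, skip_first=True))  # (17, 19)
```

In this toy case, discounting the first residue raises the identity from 85% to roughly 89%, which mirrors the kind of improvement noted in the text.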

Sequence Analysis of pBAN3-30

The complete nucleotide sequence of pBAN3-30 and the deduced amino
acid sequence of the translation product are shown in Figure 7. The cDNA insert is
1186 bp long and includes the entire chitinase coding region. The ATG beginning
at position 55 is likely to be the translation initiation codon because the nucleotide
sequence flanking the first ATG codon matches 8 of the 12 bases in the consensus
for translation start sites in plants (Joshi, C.P. (1987) Nucl. Acids Res. 15:6543),
whereas the sequence flanking another potential in-frame downstream start site
(at position 100) is identical at only 5 of the 12 bases. There is an in-frame
termination codon at position 1024 and several putative polyadenylation signals
between positions 1136 and 1148.
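
The scoring of candidate start codons against a 12-base start-site consensus can be sketched as follows. This is an illustration under stated assumptions: the consensus string `TAAACAATGGCT` is assumed here to stand for the plant consensus of Joshi (1987), the ATG is assumed to be counted among the 12 positions, the function name is hypothetical, and the test sequences are invented rather than taken from pBAN3-30.

```python
# Assumed 12-base plant start-site consensus; the ATG occupies offsets 6-8.
ASSUMED_CONSENSUS = "TAAACAATGGCT"
ATG_OFFSET = 6


def start_context_score(seq: str, atg_index: int,
                        consensus: str = ASSUMED_CONSENSUS,
                        atg_offset: int = ATG_OFFSET) -> int:
    """Count positions at which the bases around an ATG match the consensus.

    atg_index is the 0-based index of the A of the ATG, so an ATG at
    1-based position 55 corresponds to atg_index=54.
    """
    start = atg_index - atg_offset
    end = start + len(consensus)
    if start < 0 or end > len(seq):
        raise ValueError("not enough flanking sequence around this ATG")
    window = seq[start:end].upper()
    if window[atg_offset:atg_offset + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    return sum(1 for a, b in zip(window, consensus) if a == b)


# Hypothetical sequences, each with an ATG at index 8.
perfect = "GGTAAACAATGGCTCC"
weaker = "CCGCCGCCATGGCGAA"
print(start_context_score(perfect, 8), start_context_score(weaker, 8))
```

Ranking each in-frame ATG by this score reproduces the style of argument used in the text, in which the first ATG is preferred because its flanking sequence matches more consensus positions than the downstream candidate.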

The open reading frame spans 323 amino acids from which a translation
product of 35,232 Da is predicted. A GenBank search using the full cDNA
sequence reveals significant homology between pBAN3-30 and chitinase genes
characterized from winged bean (Psophocarpus tetragonolobus, M. Esaka and T.
Teramoto, unpublished), cow pea (Vigna unguiculata, L.T.T. Vo et al.,
unpublished), azuki bean (Vigna angularis; see, Ishige, F., et al. (1993) Plant Cell
Physiol. 34:103), maize (Zea mays; see, Didierjean, L., et al. (1996) Planta
199:1), and chick pea (Cicer arietinum; see, Vogelsang, R., et al. (1993) Plant
Physiol. 103:297). The deduced amino acid sequence of pBAN3-30 encoding P31
in banana shares sequence homology with other plant chitinases, especially with
class III acidic chitinases that have been characterized from various dicots. At the
amino acid level, the banana acidic chitinase amino acid sequence shows
significant homology, 47-53% identity, to acidic chitinases from Arabidopsis
thaliana (Samac, D.A., et al. (1990) Plant Physiol. 93:907), wine grape (Vitis
vinifera, Busam et al., unpublished), tobacco (Nicotiana tabacum; see, Lawton, K.
et al. (1992) Plant Molec. Biol. 19:735), chickpea, sugar beet (Beta vulgaris; see,
Nielsen, K.K., et al. (1993) Molec. Plant-Microbe Interact. 6:495), winged bean,
and cucumber (Cucumis sativus; see, Lawton, K.A. et al. (1994) Molec. Plant-
Microbe Interact. 7:48).
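
The relationship between the codon coordinates, the length of the open reading frame, and a predicted molecular mass can be sketched as follows. The start position (55) and stop position (1024) are taken from the text; the table of average residue masses, the function names, and the short test peptides are assumptions introduced for illustration, and the sketch does not reproduce the software actually used to obtain the 35,232 Da figure.

```python
# Average residue masses (Da) for the 20 standard amino acids (assumed table).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water is added for the free N- and C-termini


def orf_codons(start_pos: int, stop_pos: int) -> int:
    """Sense codons between a start codon and an in-frame stop codon.

    Both arguments are 1-based positions of the first base of the codon.
    """
    span = stop_pos - start_pos
    if span <= 0 or span % 3 != 0:
        raise ValueError("stop codon is not downstream and in frame")
    return span // 3


def predicted_mass(protein: str) -> float:
    """Average molecular mass (Da) of an unmodified polypeptide."""
    return sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER


print(orf_codons(55, 1024))  # 323, matching the stated ORF length
print(round(predicted_mass("GG"), 4))
```

Applying `predicted_mass` to the full deduced sequence of Figure 7 would give a value comparable to the predicted translation product; small differences are expected depending on the mass table used.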

An amino acid sequence alignment of the amino-terminal and carboxy-
terminal regions of several plant acid chitinases with P31 from banana appears in
Figure 8. Hydrophilicity analysis of the deduced protein sequence of P31 reveals
a hydrophobic region from amino acid 1 to 25 (underlined in Figure 8A). This
region may represent a signal sequence that would direct targeting to the ER. If
this putative signal peptide is removed, the remaining sequence closely matches
the N-terminal sequence obtained from the purified protein, which suggests that
P31 is post-translationally processed. This signal peptide does not share
significant homology with the signal peptide sequences of other plant class III
acidic chitinases (see Figure 8A), which are typically localized to the extracellular
space (Punja, Z.K. et al. (1993) J. Nematol. 25:526; Collinge, D.V., et al. (1993)
Plant J. 3:31; Lawton, K. et al. (1992) Plant Molec. Biol. 19:735; Graham, L.S.,
et al. (1994) Canad. J. Botany 72:1057; Bol, J.F. (1990) Ann. Rev. Phytopathol.
28:113-138).
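
A sliding-window hydropathy calculation of the kind used to locate a hydrophobic N-terminal region can be sketched as follows. The text does not name the scale that was used; the Kyte-Doolittle scale, the window size, the function names, and the toy sequence below are all assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic); assumed scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list:
    """Mean hydropathy of every window of the given width along the sequence."""
    seq = protein.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def mean_hydropathy(protein: str, start: int, end: int) -> float:
    """Mean hydropathy of residues start..end (1-based, inclusive)."""
    segment = protein.upper()[start - 1:end]
    return sum(KD[aa] for aa in segment) / len(segment)


# Hypothetical sequence: a hydrophobic leader followed by a polar stretch.
toy = "MLLLLALLLLAVVAA" + "DDKKEERRNNQQ"
print(mean_hydropathy(toy, 1, 15) > 0, mean_hydropathy(toy, 16, 27) < 0)
```

A run of strongly positive window means at the N-terminus, followed by a drop, is the pattern that would be read as a candidate signal peptide such as residues 1 to 25 of P31.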

In addition to the N-terminal signal peptide, the banana P31 sequence is
distinguished from other chitinase sequences by the presence of a 19 amino acid
C-terminal extension (underlined in Figure 8B). C-terminal propeptides (CTPPs)
that direct proteins to the plant vacuole have been identified in a number of
monocot and dicot polypeptides. Among others, CTPPs have been characterized in
vacuolar lectins from barley and rice, and from vacuolar β-1,3-glucanase and
chitinase from tobacco (see, Bednarek, S.Y. (1992) Plant Molec. Biol. 20:133, for
review). In general there is little sequence homology among plant vacuolar
targeting sequences. However, weak homology can be detected between the C-
terminal extension of P31 (SNILSMP) and vacuolar targeting sequences that have
been characterized in the sweet potato storage protein sporamin (NPIRLP)
(Linthorst, H.J.M. (1991) Crit. Rev. Plant Sci. 10:123) and in a 2S albumin from
Brazil nut (NLSPMRCP) (Saalbach, G. et al. (1996) Plant Physiol. 112:975).

Based on amino acid sequences, chitinases can be grouped into four
classes. Class I includes a majority of chitinases described, possessing an N-
terminal cysteine-rich lectin or "hevein" (chitin-binding) domain and a highly
conserved catalytic domain. Class II chitinases lack the N-terminal cysteine-rich
domain but have a high amino acid sequence identity to the main structure of class
I chitinases. Class III chitinases show little sequence similarity to plant enzymes
in class I or II, but may in fact be more similar to bacterial chitinases. The
majority of class III chitinases are classified as such on the basis of homology to
previously described lysozymes with chitinase activity. Class IV chitinases
contain a cysteine-rich domain and conserved main structure which resemble those
of class I chitinases but are significantly smaller due to four deletions (Punja, Z.K.,
et al. (1993) J. Nematol. 25:526; Collinge, D.V., et al. (1993) Plant J. 3:31;
Graham, L.S., et al. (1994) Canad. J. Botany 72:1057). Although the banana
pulp chitinase shares significant sequence homology with other plant class III
acidic chitinases, the predicted isoelectric point of P31 is 7.63 (neutral). In
addition, studies to determine the chitinase active sites in bacterial chitinases
indicate that these sites appear to be conserved in plant, bacterial, and fungal
sequences (Perlick, A.M., et al. (1996) Plant Physiol. 110:147). At least five
highly conserved amino acids have been shown to be necessary for chitinase
activity, and the deduced amino acid sequence of P31 indicates that only three of
the five amino acids necessary for chitinase activity are conserved in banana P31
(not shown) (Watanabe, T., et al. (1993) J. Biol. Chem. 268:18567; Tsujibo, H.,
et al. (1993) Biosci. Biotech. Biochem. 57:1396).

Role of chitinase in banana pulp 

In plants, class III chitinases have been reported to be induced in response
to various stresses such as pathogenesis and wounding (Ishige, F., et al. (1993)
Plant Cell Physiol. 34:103; Lawton, K., et al. (1992) Plant Molec. Biol. 19:735;
Nielsen, K.K., et al. (1993) Molec. Plant-Microbe Interact. 6:495; Lawton, K.A.,
et al. (1994) Molec. Plant-Microbe Interact. 7:48; Mehta, R.A., et al. (1991)
Plant Cell Physiol. 32:1057). Recently, it has been reported that the expression of
several pathogenesis and stress-related proteins, including chitinases, is associated
with fruit ripening. Several genes encoding pathogenesis-related proteins such as
endochitinase are associated with ripening in banana pulp (Clendennen, S.K., et
al. (1997) Plant Physiol.). Considering the antifungal activity that they exhibit in
other plants, it is possible that chitinases fulfill a protective role during fruit
development and ripening. However, in contrast to the ripening-associated PR-
proteins studied in cherry, avocado, and tomato, banana P31 decreases in
abundance during ripening. Although it is possible that the banana chitinase
serves a protective role during fruit development, an alternate hypothesis is that
the chitinase in banana pulp has been recruited to serve as a storage protein in this
tissue.

Storage proteins are a heterogeneous group of proteins for which no
defined assay is available. According to a recent review (Staswick, P.E. (1994)
Ann. Rev. Plant Physiol. Plant Molec. Biol. 45:303), storage proteins generally
share the features listed below; we relate traits of P31 to general features of
storage proteins.

1) Storage proteins are very abundant. We have found P31 to be very
abundant in unripe banana pulp, accounting for approximately 20 to 30% of total
soluble pulp protein.

2) Storage proteins are preferentially degraded during a
subsequent developmental stage. For example, a vegetative storage protein
characterized from the bark of poplar trees accumulates during fall and winter and
is degraded during shoot growth in the spring. P31 is preferentially degraded
during banana fruit ripening. Both the transcript and protein abundance decrease
during ripening. If P31 is indeed a storage protein in banana pulp, this
preferential degradation implies the existence of a protease specific to the storage
protein, and inhibition of the protease would inhibit degradation of the storage
protein.

3) Storage proteins are generally localized in storage vacuoles within the
cell. The sub-cellular localization of P31 has not yet been determined. According
to the deduced amino acid sequence of P31, there is a putative signal peptide
sequence for P31 that is 25 amino acids long and hydrophobic. In addition, the
amino acid sequence of P31 from banana pulp is distinguished from other plant
class III acid chitinases by the presence of an 18 amino acid C-terminal extension
that shows some homology to previously characterized C-terminal vacuolar
targeting signals, suggesting vacuolar localization of P31.

4) Many storage
proteins contain a large proportion of amino acid residues with nitrogen-
containing R-groups. Amino acid composition analysis of P31 indicates that 22%
of residues have N-containing R-groups (Trp, Gln, Asn, Lys, Arg, His). This is
approximately the same proportion of N-containing amino acids as in vegetative
storage proteins from soybean and poplar (21-25%). Interestingly, the proportion
of N-containing amino acids in P31 is not significantly higher than that of other
plant chitinases (17-23%).

5) Storage proteins typically lack any other metabolic or structural
role. However, this is not necessarily true for soybean vegetative storage protein,
which has retained a minimal acid phosphatase activity, and patatin, a potato tuber
storage protein that exhibits residual lipid acyl hydrolase activity. Preliminary
evidence suggests that protein extracts from banana pulp have very low chitinase
activity, as measured by soluble chitobiose released from radiolabeled chitin. In
addition, only three of the five amino acids which have been determined to be
essential for chitinase activity are conserved in P31.

Taken together, this evidence
lends support to the hypothesis that P31, while sharing sequence homology with
plant chitinases, may actually be serving as a storage protein in banana pulp.
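
The composition figure cited for feature 4 (the share of residues with nitrogen-containing R-groups) can be computed from any deduced amino acid sequence with a few lines of code. The residue set follows the list given in the text (Trp, Gln, Asn, Lys, Arg, His); the function name and the toy peptide are hypothetical, and the sketch is not applied here to the actual P31 sequence.

```python
# One-letter codes for Trp, Gln, Asn, Lys, Arg, His, as listed in the text.
N_CONTAINING = frozenset("WQNKRH")


def n_containing_fraction(protein: str) -> float:
    """Fraction of residues whose side chain contains nitrogen."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(1 for aa in seq if aa in N_CONTAINING) / len(seq)


# Hypothetical 10-residue peptide with six N-containing side chains.
print(n_containing_fraction("WQNKRHAAAA"))
```

Multiplying the result by 100 gives a percentage that can be compared with the 21-25% range cited for vegetative storage proteins and the 17-23% range cited for other plant chitinases.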



EXAMPLE 3: A Novel Fruit-Associated Class of Metallothionein-Like
Proteins from Banana (Musa acuminata cv. Grand Nain):
Characterization of the gene family and induction by H2O2

In the experiments described in this Example, the gene family encoding the
fruit-associated MTs is characterized, and sequence and functional evidence is
provided that at least one member functions as an antioxidant during fruit
ripening.



MATERIALS AND METHODS

Plant Materials

Ethylene-treated and untreated banana fruit (Musa acuminata cv. Grand
Nain) were obtained from the Northside Banana Company (Houston, TX). The
pulp and peel of fruit representing different stages of ripening (PCI 1 and 3) were
separated and quick-frozen in liquid nitrogen. Tissues from ten individual fruit
were pooled to obtain a uniform representative sample for each ripening stage and
ground to a fine powder under liquid nitrogen in a stainless steel Waring blender.
Ground samples were stored at -80°C until utilized. Leaf, corm and root tissue
were obtained from greenhouse-grown plants (cv. Grand Nain), ground in liquid
nitrogen using a mortar and pestle, and stored at -80°C.

RNA Isolation and Northern Blotting 

Pre-warmed (65°C) RNA extraction buffer (1.4% (w/v) SDS, 2% (w/v)
polyvinylpyrrolidone, 0.5 M NaCl, 0.1 M sodium acetate, 0.05 M EDTA (pH 8.0),
0.1% (v/v) β-mercaptoethanol) was added to previously ground samples of pulp at
a ratio of 5 ml buffer per gram of tissue. Samples were homogenized with several
short bursts of a tissue homogenizer (Polytron, Brinkmann) and incubated at 65°C
for 15 min. Starch and other cell debris were pelleted by centrifugation at 2,400g
for 10 min at room temperature and the supernatant transferred to a disposable
polypropylene tube. After the addition of 0.2 vol. of 5 M potassium acetate (pH
4.8), the samples were mixed and incubated on ice for 30 min. The resulting
precipitate was pelleted by centrifugation at 20,200 rpm for 10 min at 4°C in a
Sorvall SW28 rotor. The supernatant was transferred to a disposable
polypropylene centrifuge tube, and the high-molecular weight RNAs were
precipitated by the addition of lithium chloride to a final concentration of 2.5 M
and incubation overnight at 4°C.

RNA was extracted from previously frozen ground peel, root, leaf, and 
corm tissue using CTAB extraction. 

Five micrograms of total RNA from root, corm, and leaf tissue of
greenhouse-grown plants, and from peel and pulp (PCI 3) were separated by
electrophoresis in formaldehyde-containing 2% agarose gels and transferred to
nylon membrane (Nytran Plus, Schleicher and Schuell) using 20X SSPE as a
transfer buffer and a vacuum transfer apparatus (Bio-Rad). Equal RNA loading
was confirmed by staining the RNA on the nylon membranes with methylene blue
(Sambrook et al., 1989). RNA blots were prehybridized in 1 mM EDTA, 0.25 M
phosphate buffer (pH 7.2), 7% (w/v) SDS, and hybridized overnight at 65°C in
the same solution containing the denatured probe (1 X 10^ cpm/ml). Hybridized
filters were washed twice for 30 min each at 65°C in Wash Solution One [1 mM
EDTA, 40 mM phosphate buffer (pH 7.2), 5% (w/v) SDS] and three times for 30
min each at 65°C in Wash Solution Two [1 mM EDTA, 40 mM phosphate buffer
(pH 7.2), 1% (w/v) SDS]. The air-dried filters were subjected to autoradiography
(X-Omat X-ray film, Kodak) at -80°C with an intensifying screen. The RNA
blots were hybridized with a cDNA probe representing either the MT cDNA clone
isolated from library 1 or 3, using the Rad-Prime DNA Labeling System (Gibco
BRL) to label the DNA probes.

Genomic DNA isolation and Southern Blotting 

Leaf tissue was ground with a mortar and pestle under liquid nitrogen and
added to a tube containing pre-warmed (65°C) DNA isolation buffer. The
mixture was incubated at 65°C for 30 minutes, then extracted twice with an equal
volume of chloroform. After the second extraction, DNA was precipitated from
the aqueous phase by the addition of an equal volume of isopropanol, and mixed
by gently inverting the tube. DNA was pelleted by centrifugation, washed with
70% ethanol, dried briefly, and resuspended in TE (pH 8.0). DNA samples were
treated with RNase, then phenol extracted with TE-buffered phenol by rocking
gently, chloroform extracted, and precipitated with 2.5 vol ethanol.

For the genomic Southern blots, 15 µg of genomic DNA was digested with
restriction endonucleases BamHI, HindIII, EcoRI, PstI, and SalI (Promega), and
restriction fragments were separated by electrophoresis on a 0.7% agarose gel.
DNA in the gel was denatured (1.5 M NaCl, 0.5 M NaOH) and neutralized (1.5
M NaCl, 0.5 M Tris, pH 8.0) before being transferred to nylon membrane (S&S
Nytran Plus) using a BioRad vacuum transfer apparatus. DNA was covalently
crosslinked to membrane by UV irradiation (Stratalinker, Stratagene), and the
membrane was hybridized separately with probes corresponding to the MT cDNA
clones isolated from the banana pulp cDNA libraries from PCI 1 and 3 (MT-F1
and MT-F3).
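
The logic of a restriction digest with the five enzymes named above can be sketched in code by locating each recognition site in a known sequence and deriving the resulting fragment lengths. The recognition sequences and top-strand cut offsets are those generally given for these enzymes; the function names and the 24-base test sequence are hypothetical, and the sketch treats the DNA as a linear molecule and reports top-strand fragment lengths only.

```python
# Enzyme -> (recognition site, cut offset within the site on the top strand).
ENZYMES = {
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "PstI": ("CTGCAG", 5),     # CTGCA^G
    "SalI": ("GTCGAC", 1),     # G^TCGAC
}


def find_sites(seq: str, enzyme: str) -> list:
    """1-based start positions of every recognition site for the enzyme."""
    site, _ = ENZYMES[enzyme]
    seq = seq.upper()
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i + 1)
        i = seq.find(site, i + 1)  # step by one so overlapping sites are found
    return positions


def fragment_lengths(seq: str, enzyme: str) -> list:
    """Top-strand fragment lengths after complete digestion of a linear molecule."""
    _, offset = ENZYMES[enzyme]
    cuts = [p - 1 + offset for p in find_sites(seq, enzyme)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical test sequence with one BamHI, one HindIII, and one EcoRI site.
toy = "CCGGATCCTTAAGCTTGGGAATTC"
for name in ENZYMES:
    print(name, find_sites(toy, name), fragment_lengths(toy, name))
```

Because all five recognition sites are palindromic, scanning one strand is sufficient. On a genomic Southern blot, the number of hybridizing fragments per digest is what supports an estimate of gene copy number.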

Genomic library screening and mapping

Approximately 6 x 10^ primary plaques from a Musa acuminata cv. Grand
Nain λ FIX genomic library (Stratagene) were screened with the MT cDNA probe
isolated from the PCI 1 pulp cDNA library (MT-F1). Plaque-lifts containing filter-
bound λ phage DNA were denatured for two min in 1.5 M NaCl, 0.5 M NaOH,
and neutralized for four minutes in 1.5 M NaCl, 0.5 M Tris (pH 8.0). Filters
were rinsed in 0.5 M Tris (pH 8.0), blotted dry, and the DNA was covalently
crosslinked to the filters by UV irradiation (Stratalinker, Stratagene). Plaque-lifts
were hybridized as described previously. Twenty-four positives were plaque
purified, and λ phage DNA was isolated for generating maps of the region
containing the MT gene. Southern blot analysis was used to determine the identity
of the MT clone according to diagnostic restriction sites. Fragments of the
genomic clones containing the coding region and 5' and 3' flanking region were
subcloned into pBluescript KS, and subclones were mapped and sequenced.

Sequencing and Data Analysis

Small-scale alkaline lysis plasmid preparations followed by
phenol:chloroform extraction and ethanol precipitation (Sambrook et al., 1989)
yielded template plasmid DNA suitable for automated sequencing. Plasmid DNA
templates were sequenced, using the T3 primer, on an ABI 373A DNA sequencer
(Applied Biosystems, Foster City, CA).

Using the BLASTX search algorithm, it was determined that the banana
cDNA clones shared significant sequence homology with MT cDNA clones
isolated from other fruit. The deduced amino acid sequences of plant MT cDNA
clones were aligned using Clustal. A dendrogram showing the relationship among
several different classes of plant MTs was generated from the deduced amino acid
sequences using Clustal.

Protoplast isolation and dot blot analysis of MT transcript abundance 
Protoplasts from banana pulp at PCI 4 were isolated as described in Khalid
et al. (in preparation). 1 X 10^ protoplasts were incubated under experimental
conditions for 4 h at room temperature in protoplast isolation buffer (Khalid et al.
1997), with gentle rocking to keep the cells suspended. The treatments included
incubation with different concentrations of ascorbate (buffered to pH 7.0), CuCl2,
and hydrogen peroxide from 1 to 100 mM. After the incubation, a crude RNA
preparation from the protoplasts was spotted onto nylon membrane in duplicate.
One membrane was hybridized to the F3 cDNA probe to determine relative
transcript abundance of fruit-associated MT. The second membrane was
hybridized with an 18S ribosomal RNA probe to assess RNA loading. The
membranes were then exposed to a phosphorescent screen (PhosphorImager,
Molecular Dynamics) and the scanned images were quantified with the
ImageQuant software. The relative abundance was normalized to the measure of
total RNA loaded, and is expressed in arbitrary units.
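
The normalization step described above, in which the MT hybridization signal for each treatment is divided by the 18S ribosomal RNA signal from the duplicate membrane, can be sketched as follows. The function names, treatment labels, and signal values are hypothetical and are not measurements from the experiments; the sketch is independent of the ImageQuant software.

```python
def normalize_dot_blot(mt_signal: dict, rrna_signal: dict) -> dict:
    """Return treatment -> MT signal divided by the 18S rRNA loading signal."""
    normalized = {}
    for treatment, mt in mt_signal.items():
        loading = rrna_signal[treatment]
        if loading <= 0:
            raise ValueError(f"non-positive loading signal for {treatment}")
        normalized[treatment] = mt / loading
    return normalized


def fold_change(normalized: dict, control: str = "untreated") -> dict:
    """Express each normalized value relative to the control treatment."""
    base = normalized[control]
    return {treatment: value / base for treatment, value in normalized.items()}


# Hypothetical scanned intensities in arbitrary units.
mt = {"untreated": 100.0, "H2O2 10 mM": 600.0}
rrna = {"untreated": 50.0, "H2O2 10 mM": 60.0}

relative = normalize_dot_blot(mt, rrna)
print(relative)
print(fold_change(relative))
```

Dividing by the loading control before comparing treatments keeps differences in the amount of RNA spotted from being mistaken for changes in transcript abundance.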

RESULTS

The cDNA sequence of the banana fruit-associated MT clones is shown in
Figure 9. The clones were isolated by differential screening of pulp cDNA
libraries (Clendennen and May, 1997). F1 was isolated from the PCI 1 library,
whereas F3 was isolated from the PCI 3 library. The cDNA clones are slightly
variable in size, and most of the differences in size and primary sequence occur
in the 3' untranslated region (UTR), with F1 having approximately 70 more bases
than F3. The two banana cDNA sequences are 60% identical at the nucleotide
level, and 81% identical within the coding region.

While both of the banana fruit-associated MT polypeptides are 65 amino
acids long, the two cDNA clones encode distinct polypeptides. The N-terminal and C-
terminal domains are well conserved, and separated by a variable spacer. In
Figure 10A, an alignment of deduced amino acid sequences shows the degree of
similarity among the different fruit-associated MTs from banana, kiwifruit,
papaya, and apple. In panel B, the relationships among a variety of plant MTs are
depicted in a dendrogram generated from the deduced amino acid sequences. The
type 2 MT sequences cluster together, as do the type 1 MT
sequences. The fruit-associated MT sequences (banana, kiwifruit, papaya, and
apple) cluster together and are distinct from both type 1 and type 2 plant MTs.

Despite the sequence similarity, the size difference between the transcripts
of the two banana MT cDNA clones allows them to be separated on a high
percentage (2%) agarose gel and detected by northern blotting and hybridization
separately with each probe (Figure 11). Transcript sizes of F1 and F3 as
determined from northern analysis are approximately 540 and 430 bases,
respectively. The larger transcript (F1) is abundant in pulp, peel, and corm. It is
also present in low abundance in banana leaves, but is not detected in roots. The
smaller transcript (F3) is most abundant in leaves, present in pulp and peel, and
barely detectable in root and corm tissue.

Southern analysis using both cDNAs as probes indicates the presence of up
to five copies of the fruit type MT: two copies with homology to F1 and three
copies with homology to F3 (data not shown). Twenty-four genomic clones of
fruit MT were isolated from the genomic library, and restriction maps of the
region containing the MT gene indicated that three distinct genes had been
isolated. Clones representing both the F1 and F3 cDNA clones were isolated
from the genomic library, as well as a gene with homology to the fruit-associated
MT F1, but for which no cDNA clone has been isolated (named MT-F1b).
Subclones of these different MT genes were generated and the regions containing
the coding region as well as 5' and 3' flanking regions were mapped. Maps of the
different MT genes, including the coding region and at least 1 kb of 5' and 3'
flanking regions, appear in Figure 12. Based on mapping and sequence data it can
be determined that the MT F3 gene comprises three exons separated by two
introns. The mapping resolution was not fine enough to determine the existence
or position of introns in the other MT genes. The nucleotide sequence of the F3
genomic clone from the HindIII site to the SalI site, which includes the complete
coding region, is depicted in Figure 13. Several features of the sequence are
highlighted in the figure, including a 10-base 5' sequence motif beginning at -313
from the translation start site (in capital letters) that shares homology with an
antioxidant response element (ARE). The putative TATA-box (starting at position -96
from the translation start site) is underlined, and the three exons (beginning from
the translation start site) are depicted in capital letters. At the 3' end of the
sequence, the stop codon is underlined, as well as a potential polyadenylation
signal (TAAATAAA).

Because of the putative ARE identified in the 5' flanking sequence, the
effect of antioxidants (ascorbate), oxidizing agents (H2O2), and metal ions (Cu2+)
on MT transcript abundance was determined in banana pulp protoplasts. H2O2,
but not copper ions, resulted in a dramatic and dose-dependent increase in the
relative abundance of the fruit-associated MT transcript (Figure 14). The presence
of ascorbate resulted in a reduction in the relative MT transcript abundance as
compared to an untreated control.

DISCUSSION 

Eleven non-redundant groups of ripening-associated cDNA clones were
isolated from banana pulp cDNA libraries by differential screening and identified
by sequence homology (Clendennen and May, 1997). One of the groups of cDNA
clones includes a previously uncharacterized type of metallothionein (MT), the
transcript of which is found abundantly in ripening banana pulp. There are two
classes of this ripening-associated MT transcript in banana pulp that vary in
primary sequence and in size. Both the larger (F1) and the smaller (F3)
transcripts increase in abundance in banana pulp during ripening, but F1 increases
more dramatically than F3. In addition, the tissue distribution of these transcripts
differs: MT-F1 is expressed abundantly in the pulp and peel, and slightly in corm
tissue, whereas MT-F3 is expressed abundantly in pulp, peel, and leaves, and very
slightly in roots. Based on the isolation of two distinct cDNA clones, it was
suspected that the fruit-associated MTs were encoded by a small gene family.
Southern analysis confirmed this, and suggested the presence of up to five
members of the fruit-associated MT gene family in banana. Three different MT
genes were identified after screening twenty-four genomic clones that hybridized
to F1 and F3 cDNA probes, as determined by restriction mapping of the segment
containing the coding region. Genomic clones representing both cDNA clones
were isolated.

While the invention has been described and illustrated herein by reference
to various specific materials, procedures and examples, it is understood that the
invention is not restricted to the particular materials, combinations of materials, and
procedures selected for that purpose. Numerous variations of such details can be
implied and will be appreciated by those skilled in the art.

WHAT IS CLAIMED IS:

1. An isolated and purified banana DNA molecule, said DNA molecule being
differentially expressed during banana fruit development.

2. A DNA molecule according to claim 1, wherein said DNA molecule
encodes a protein selected from the group consisting of a starch synthase, a
granule-bound starch synthase, a chitinase, an endochitinase, a beta-1,3
glucanase, a thaumatin-like protein, an ascorbate peroxidase, a metallothionein, a
lectin, and another senescence-related protein.

3. A DNA molecule according to claim 1, selected from the group consisting
of clones pBAN 3-33, pBAN 3-18, pBAN 3-30, pBAN 3-24, pBAN 1-3, pBAN
3-28, pBAN 3-25, pBAN 3-6, pBAN 3-23, pBAN 3-32, and pBAN 3-46.

4. A DNA molecule according to claim 1, wherein said DNA molecule has
the nucleotide sequence selected from the group consisting of SEQ ID NO: 1,
SEQ ID NO: 2, and SEQ ID NO: 3.

5. A DNA molecule according to claim 1, wherein said DNA molecule
encodes a protein having an amino acid sequence selected from the group
consisting of SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, the DNA sequence
shown in Figure 16, the DNA sequence shown in Figure 17, the DNA sequence
shown in Figure 18, and the DNA sequence shown in Figure 19.

6. A chimeric gene comprising a DNA molecule which is differentially
expressed during banana fruit development operably linked to a heterologous
promoter.

7. A replicable expression vector comprising the chimeric gene of claim 6.

8. A plant genome, comprising the chimeric gene of claim 6.

9. A plant cell, comprising the chimeric gene of claim 6.

10. A plant comprising the chimeric gene of claim 6, wherein said chimeric
gene is stably integrated into the plant genome.

11. An isolated and purified banana protein which is differentially produced in
developing banana fruit.

12. A protein according to claim 11, wherein said protein is selected from
the group consisting of a starch synthase, a granule-bound starch synthase, a
chitinase, an endochitinase, a beta-1,3 glucanase, a thaumatin-like protein, an
ascorbate peroxidase, a metallothionein, a lectin, and another senescence-related
protein.

13. A protein according to claim 11, wherein said protein is encoded by a
DNA molecule selected from the group consisting of clones pBAN 3-33, pBAN
3-18, pBAN 3-30, pBAN 3-24, pBAN 1-3, pBAN 3-28, pBAN 3-25, pBAN 3-6,
pBAN 3-23, pBAN 3-32, and pBAN 3-46.

14. A protein according to claim 11, wherein said protein has an amino acid
sequence selected from the group consisting of SEQ ID NO: 4, SEQ ID NO: 5,
SEQ ID NO: 6, the amino acid sequence shown in Figure 16, the amino acid
sequence shown in Figure 17, the amino acid sequence shown in Figure 18, and
the amino acid sequence shown in Figure 19.

15. A composition comprising the protein of claim 11 and a carrier therefor.

16. A plant cell comprising the protein of claim 11.

17. An isolated and purified banana DNA regulatory element which is 5' or 3'
to a gene which is differentially expressed during banana fruit development.

18. A regulatory element according to claim 17, wherein said regulatory
element is activated by an ethylene signal.

19. A regulatory element according to claim 18, wherein the ethylene signal is
produced by developing fruit.

20. A regulatory element according to claim 18, wherein the ethylene signal is
produced by exogenous ethylene gas.

21. A chimeric gene comprising a banana DNA regulatory element operably
linked to a heterologous DNA molecule, wherein said regulatory element is
naturally found 5' or 3' to a gene which is differentially expressed during banana
fruit development.

22. A plant genome comprising a chimeric gene according to claim 21.

23. A plant cell transformed with a chimeric gene according to claim 21.

24. A plant comprising a chimeric gene according to claim 21, wherein said
chimeric gene is stably integrated into the plant genome.

25. A method for expression of heterologous protein in fruit comprising
transforming fruiting plants with a chimeric gene according to claim 21, exposing
said fruit to a plant development signal, and harvesting fruit containing said
heterologous protein.

26. The method of claim 25, wherein the plant development signal is ethylene
gas produced by ripening fruit.

27. The method of claim 25, wherein the plant development signal is
exogenous ethylene gas.

28. The method of claim 25, further comprising the step of isolating the
heterologous proteins from the harvested fruit.

29. The method of claim 25, wherein the heterologous protein is a therapeutic
protein.

30. A fruit produced by the method of claim 25.

31. The fruit of claim 30, wherein said fruit is a banana.

32. A protein produced by the method of claim 25.

33. A protein produced by the method of claim 28.

I I I I I I I I ' I I I — I I I 



-t— H — (— H 



I I I — I I I I I I I I I I I I I I I I I I I I I 



GTTCAGGAGTAGGTCTTAAAACCTTCCCTATTCCTACCACCCCTCTTTCT 

KSSSRILEGIRMVGRK 
PSPHPEFWKG GWWGER 
QVL I QNFGRDKDGGEKE 



ACAAGCTGTTGCCTTTCGTTTTCTTCTATCAGGAAGCCAAGAGTTTCAAG
TGTTCGACAACGGAAAGCAAAAGAAGATAGTCCTTCGGTTCTCAAAGTTC
NKLLPFVFFYQEAKSFK 
TSCCLSFSSIRKPRVSR 
QAVAFRF LLSGSQEFQ 



AGGAGGGTAGACCTGAGGGGATGATGCCTGTGTCGAAACCTCTATATAAG 






TCCTCCCATCTGGACTCCCCTACTACGGACACAGCTTTGGAGATATATTC 
RRVDLRG.CLCRNLYIR 
GG.T.GDDACVETSI 
EEGRPEGMMPVSKPLYK 
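
The sequence blocks in these figures show an upper strand, its complementary lower strand, and three rows of forward reading-frame translations. As an aid to checking a block against the drawing, the following minimal Python sketch reproduces that layout for one block. It is illustrative only: the function names `complement` and `translate` are hypothetical and form no part of the disclosure, the standard genetic code is assumed, and stop codons are written as "." and codons containing N as "?" to follow the convention visible in the figures.

```python
# Illustrative sketch only; names are hypothetical, standard genetic code assumed.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Codon -> one-letter amino acid ('*' marks a stop codon).
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def complement(seq):
    """Return the base-by-base complement, i.e. the lower strand as drawn."""
    return seq.translate(COMPLEMENT)


def translate(seq, frame=0):
    """Translate one forward reading frame (frame = 0, 1 or 2)."""
    seq = seq.upper()
    residues = []
    for pos in range(frame, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[pos:pos + 3], "?")  # '?' if the codon contains N
        residues.append("." if aa == "*" else aa)    # '.' for stop, as in the figures
    return "".join(residues)


# Upper strand of the block shown immediately above.
top = "AGGAGGGTAGACCTGAGGGGATGATGCCTGTGTCGAAACCTCTATATAAG"
print(complement(top))
for frame in range(3):
    print(translate(top, frame))
```

Because the drawn translation rows carry codons across line breaks, a row in the figure may show one more residue at its start or end than the per-block translation computed here.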

1425 

GAGTAGGAACACAGCATGTTGATGAACACAAACCATTTCAGCGGGGAAGA
CTCATCCTTGTGTCGTACAACTACTTGTGTTTGGTAAAGTCGCCCCTTCT

SRNTAC. .TQTISAGK 
GVGTQHVDEHKPFQRGR 
E.EHSMLMNTNHFSGEE 



1479 



Hind III



agagaacccttttgacagagttgttgtcatggcaacaaaagcttctctct
tctcttgggaaaactgtctcaacaacagtaccgttgttttcgaagagaga
krtlltellswqqklls 
repf.qscchgnksfsl 
enpfdrvvvmatkasl 
CCATAAAAGGCTTTGCCTTGCTGGTTTCAGTCCTTGTAGCAGTTCCAACA
GGTATTTTCCGAAACGGAACGACCAAAGTCAGGAACATCGTCAAGGTTGT 

P KALPCWFQSL QFQQ 

HKRLCLAGFSPCSSSN 
S I KGFALLVSVLVAVPT 

24X[TC] 

AlGjTTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCT 

TCAAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGA 

VLSLSLSLSLSLSLSL 
KFSLSLSLSLSLSLSLS 
SSLSLSLSLSLSLSLSL 

T R V 



ACA AGA GTG 
CTCATATTATACATTTGATTGTTAGCTCTTACAAATTTATTAGGGTTTTT
GAGTATAATATGTAAACTAACAATCGAGAATGTTTAAATAATCCCAAAAA 

SHI IHLIVSSYKFIRVF 
LILYI LLALTNLLGFL 
SYYTFDC .LLQIY.GF 

Hind III

ataagagttca'agcttttggtaatttaatcatggtaggttatattttcaa 


tattctcaagttcgaaaacgcttaaattagtaccatccaatataaaagtt 

irvqafgnlimvgyifk 
efkllvi .sw.vifs 
yksssfw. fnhgrlyfq 

aacttgtaacctgcattttgtctctttatttcatgcaatattcttttcct 

TTGAACCTTGGACGTAAACCAGAGAAATAAAGTACGTTATCCGAAAAGGA 

TCNLHFVSLFHA I FFS 
KLVTC I LSLYFMQYSFP 
NL.PAFCLFISCNILFL 

TGATTGGCTTACGTCATTTACTTGAGTTAGCTCATATGTAACTGTTTAAA 
AGTAACCGAATGCAGTAAATGAACTCAATCGAGTATACATTGACAAATTT 

LIGLRHLLELAHM.LFK 
LAYVIYLS.LICNCLN 
DWLTSFT VSSYVTV 

TATTTGGGATTATTGGTTAACGGATAAAAAAAATTAAGATTTTAGATACA 
CTAAACCCTAATAACCAATTGCCTATTTTTTTTAATTCTAAAATCTATGT 

YLGLLVNG.KKLIDFRY 
IWDYWLTDKKN L I LD 

IFGI IG.RIKKIN.F. I 

27 X [TA] 

ATGCTATATATATATATATATATATATATATATATATATATATATATATA
TACGATATATATATATATATATATATATATATATATATATATATATATAT

NAIYIYIYIYIYIYIY 
TMLYIYIYIYIYIYIYI 
QCYIYIYIYIYIYIYIY 

TATATATATATTATAGGTAGAAACTTGGTATAATTCACACGTATGTTCGC 
ATATATATATAATATCCATCTTTGAACCATATTAAGTGTCCATACAAGCG 

IYIYYR.KLGIIHTYVR
YIYIIGRNLV.FTRMFA
IYIL.VETWYNSHVCS
TTTATCTGAATAAAATGAGTAGTCCTTTCAATGCAGATTAGTCTTACTCC



FI.IK.VVLSMQISLTP
LSE.NE.SFQCRLVLL
LYLNKMSSPFNAD.SYS

ACTTGCAGATGCACGACCAATTTGCTTGATCATCTTCCATAGAGCACCAC 

TGAACGTCTACGTGCTGGTTAAACGAACTAGTAGAAGGTATCTCGTGGTG

LADARPICLIIFHRAP 
HLQMHDQFA.SSSIEHH 
TCRCTTNLLDHLP.STT 



AGCTAAGTCTCCGATGTGTTCTACTGCAGGAGTGCAATCGATTGGTGTGT 
TCGATTCAGAGGCTACACAAGATGACGTCCTCACGTTAGCTAACCACAGA 

QLSLRGVL LQECNRLVS 
S VSDVFYCRSA I DWCL 

AKSPMCSTAGVQ SIGV 

GCTACGGAATGCTCGGCAACAATCTTCCCCCGCCCAGCGAGGTGGTCAGT 


CGATGCCTTACGAGCCGTTGTTAGAAGGGGGCGGGTCGCTCCACCAGTCA 
ATECSAT I FPRPARWSV 
LRNARQQSSPAQRGGQ 
CYGMLGNNLPPPSEVVS 

CTCTACAAATGCAACAACATCGCGAGGATGAGACTCTACGATCCAAACCA 


GAGATGTTTAGGTTGTTGTAGCGCTCCTACTCTGAGATGCTAGGTTTGGT 

STNPTTSRG.DSTIQT 
SLQIQQHREDETLRSKP 
LYKSNN I ARMRLYDPNQ 




ACA 



NGA 



GTG 
V 



PstI 
GGCCGCCCTGCAAGCCCTCAGGAACTCCAACATCCAAGTCCTGTTGGATG

CCGGCGGGACGTTCGGGAGTCCTTGAGGTTGTAGGTTCAGGACAACCTAC 
RPPCKPSGTPTSKSCWM 
GRPASPQELQHPSPVGC 
AALQALRNSN IQVLLD 

TCCCCCGATCCGACGTGCAGTCACTGGCCTCCAATCCTTCGGCCGCCGGC 

AGGGGGCTAGGCTGCACGTCAGTGACCGGAGGTTAGGAAGCCGGCGGCCG 

SPDPTC SHWPP I LRPPA 
PP I RRAVTGLQSFGRR 
VPRSDVQSLASNPSAAG 

BamH I

GACTGGATCCGGAGGAACGTCGTCGCCTACTGGCCCAGCGTGTCCTTTCG 
CTGACCTAGGCCTCCTTGCAGCAGCGGATGACCGGGTCGCAGAGGAAAGC 

TGSGGTSSPTGPASPF 
RLDPEERRRLLAQRLLS 
DWI RRNVVAYWPSVSFR 

ATACATAGCTGTCGGAAACGAGCTGATCCCCGGATCGGATCTGGCGCAGT 


TATGTATCGACAGCCTTTGCTCGACTAGGGGCCTAGCCTAGACCGCGTCA 

DT .LSETS SPDRIWRS 
IHSCRKRADPRIGSGAV 
YIAVGNELIPGSDLAQ 
ACATCCTCCCCGCCATGCGCAACATCTACAATGCTTTGTCCTCGGCTGGC
TGTAGGAGGGGCGGTACGCGTTGTAGATGTTACGAAACAGGAGCCGACCG

TSSPPCATSTMLCPRLA 

HPPRHAQHLQCFVLGW 

Y I LPAMRN I YNALSSAG 

Sal I

ctgcaaaaccagatcaaggtctcgaccgcgg'tcgacacgggcgtcctcgg 


gacgttttggtctagttccagagctggcgccagctgtgcccgcaggagcc 

cktrsrsrprstrass 
pakpdqgldrgrhgrpr 
lqnqikvstavdtgvlg 



CACGTCCTACCCTCCCTCCGCCGGCGCCTTCTCCTCCGCCGCCCAGGCGT 
GTGCAGGATGGGAGGGAGGCGGCCGCGGAAGAGGAGGCGGCGGGTCCGCA 

ARPTLPPPAPSPPPPRR 
HVLPSLRRRLLLRRPGV 
TSYPPSAGAFSSAAQA 

ACCTGAGGCCCATCGTGCAGTTCTTGGCGAGTAACGGAGCGCCGCTCCTG 
TGGACTCGGGGTAGCACGTCAAGAACCGCTCATTGCCTCGCGGCGAGGAC 

T APSCSSWRVTERRSW 
PEPHRAVLGE RSAAP 
YLSPIVQFLASNGAPLL 

Sma I    Bgl II

gtcaatgtgtacccttattttagctacaccggcaaccc'gggaca'gatctc 


cagttacacatgggaataaaatcgatgtggccgttgggccctgtctagag 

smctli'latpatrdrs 
gqcvplf.lhrqpgtdl 
vnvypyfsytgnpgq I s 
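
The enzyme labels drawn above the sequence blocks mark recognition sites on the upper strand. The following Python sketch, offered only as an illustration, locates those sites in a block so that a label can be checked against the sequence beneath it. The helper name `find_sites` and the dictionary `RECOGNITION_SITES` are hypothetical; the recognition sequences listed are the commonly published ones for the enzymes named in these figures. Characters other than A, C, G, T and N (such as the tick marks printed inside some blocks) are ignored.

```python
from typing import Dict, List

# Published recognition sequences (5'->3') for the enzymes labelled in the figures.
RECOGNITION_SITES = {
    "HindIII": "AAGCTT",
    "PstI": "CTGCAG",
    "BamHI": "GGATCC",
    "SalI": "GTCGAC",
    "SmaI": "CCCGGG",
    "BglII": "AGATCT",
    "EcoRI": "GAATTC",
    "XhoI": "CTCGAG",
    "SpeI": "ACTAGT",
}


def find_sites(seq: str) -> Dict[str, List[int]]:
    """Return {enzyme: [0-based start positions]} for every site found in seq."""
    # Keep only base letters so tick marks or spaces in the figure text are skipped.
    cleaned = "".join(ch for ch in seq.upper() if ch in "ACGTN")
    hits: Dict[str, List[int]] = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = cleaned.find(site)
        while start != -1:
            positions.append(start)
            start = cleaned.find(site, start + 1)  # step by one to allow overlaps
        if positions:
            hits[enzyme] = positions
    return hits


# Upper strand of the block labelled with two enzymes above.
example = "gtcaatgtgtacccttattttagctacaccggcaaccc'gggaca'gatctc"
print(find_sites(example))
```

Positions are counted from the first base of the block, not from the numbering of the full sequence.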



gctgccgtacgccctgttcacggcctccggcgtcgtcgtgcaggatgggc 


cgacgggatgcgggacaagtgccggaggccgcagcagcacgtcctacccg 

rcptpcsrppasscrmg 
aalrpvhglrrrragwa 
lpyalftasgvvvqdg 

Sal I

gattcagctatcagaacctgttcgacgccatcg'tcgacgcggtcttggcg 


ctaagtcgatagtcttggacaagctgcggtagcaggtgcgccagaagcgc 

dsairtcstpsstrssr 
iqlsepvrrhrrrglr 

R XSYQNLFDA i vdavfa 



FIG. 15D-1 

SUBSTITUTE SHEET (RULE 26) 



WO 99/15668    PCT/US98/03343



27/91 

GCGCTGGAGAGAGTGGGAGGGGCGAACGTGGCGGTGGTGGTGTCGGAGAG 
CGCGACCTCTCTCACCCTCCCCGCTTGCACCGCCACCACCACAGCCTCTC 

RWREWEGRTWRWWCRR 
GAGESGRGERGGGGVGE 
ALERVGGANVAVVVSE S 

CGGGTGGCCGTCGGCGGGCGGAGGAGCCGAAGCGAGCACCAGCAACGCGC 


GCCCACCGGCAGCCGCCCGCCTCCTCGGCTTCGCTCGTGGTCGTTGCGCG 
AGGRRRAEEPKRAPATR 
RVAVGGRRSRSEHQQRA 
GWPSAGGGAEASTSNA 

AGACGTACAACCAGAACTTGATCAGGCATGTTGGCGGAGGAACGCCGAGG 


TCTGCATGTTGGTCTTGAACTAGTCCGTACAACCGCCTCCTTGCGGCTCC 
RRTTRT SGMLAEERRG 
DVQPELDQACWRRNAE 
QTYNQNL I RHVGGGTPR 



AGACCAGGGAAGGAGATCGAGGCATACATATTCGAGATGTTCAACGAGAA 
TCTGGTCCCTTCCTCTAGCTCCGTATGTATAAGCTCTACAAGTTGCTCTT 

DQGRRSRHTYSRCSTR 
ETREGDRGIHIRDVQRE 
RPGKE I EAY I FEMFNEN 



CCAGAAGGCTGGAGGGATCGAGCAGAACTTTGGCCTGTTTTATCCCAACA 
GGTCTTCCGACCTCCCTAGCTCGTCTTGAAACCGGACAAAATAGGGTTGT 

TRRLEGSSRTLACF I PT 
PEGWRDRAE LWPVLSQQ 
CKAGGIEQNFGLFYPN 

Hind III



AGCAGCCCGTATACCAAATAAGCTTTTAGAAACTAACTTGTAAGGTTGAT



TCGTCGGGCATATGGTTTATTCGAAAATCTTTGATTGAACATTCCAACTA 
SSPYTK.AFRN.LVRLM 
AARIPNKLLETNL.G. 
KQPVYQI SF KLTCKVD 

5 X [CTAC] 

GAATCATCTCCTACCTACCTACCTACCTACGAATAAAACATGAAATAAAG 
CTTAGTAGAGGATGGATGGATGGATGGATGCTTATTTTGTACTTTATTTC 
NHLLPTYLPTNKT.NK 
I ISYLPTYLRIKHEIK 
ESSPTYLPTYE.NMK.S 
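
Several blocks carry an annotation of the form "5 X [CTAC]", indicating a short sequence unit repeated in tandem. The following Python sketch, given purely by way of illustration, finds the longest uninterrupted run of a chosen unit in a block so that such an annotation can be compared with the sequence. The function name `longest_tandem_repeat` is hypothetical, and the scan is a simple left-to-right, non-overlapping search rather than a general repeat finder.

```python
import re


def longest_tandem_repeat(seq, unit):
    """Return (0-based start, number of copies) of the longest run of unit, or None."""
    # Keep only base letters so stray marks in the figure text are ignored.
    seq = "".join(ch for ch in seq.upper() if ch in "ACGTN")
    unit = unit.upper()
    pattern = re.compile("(?:" + re.escape(unit) + ")+")
    best = None
    # finditer scans left to right without overlap; adequate for a quick check.
    for match in pattern.finditer(seq):
        copies = len(match.group(0)) // len(unit)
        if best is None or copies > best[1]:
            best = (match.start(), copies)
    return best


# Upper strand of the block annotated "5 X [CTAC]" above.
block = "GAATCATCTCCTACCTACCTACCTACCTACGAATAAAACATGAAATAAAG"
print(longest_tandem_repeat(block, "CTAC"))
```

A repeat that continues onto the next block would need the two blocks joined before the call, since each block is examined on its own here.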
EcoR I ; CDNA EUCLS (POLY A)

CACCAAAATAAAGGGAGAATCTTGATCTTGGAGAAAGTTGAATCATGATG 


GTGGTTTTATTTCCCTCTTAGAACTAGAACCTCTTTCAACTTAGTACTAC 
APK.RENSDLGES. IMM 
HQNKGRILILEKVES.. 
TN I KGEF SWRKLNHD 



ATATATAACAAACACCCCTCTTTACTCATTATCAGTATGTTACAAGTTTC 

TATATATTGTTTGTGGGGAGAAATGAGTAATAGTCATACAATGTTCAAAG
lYNKHPSLLI ISMLQVS 
Y I TNTPLYSLSVCYKF 
Dl .QTPLFTHYQYVTSF 

TTGAAACTTGAACGGATGACAATTTGGACCTACAAGTATTTTGGGTCATA 


AACTTTGAACTTGCCTAGTGTTAAACCTGGATGTTCATAAAACCCAGTAT 

NLNGSQFGPTS I LGH 
LET.TDHNLDLQVFWVI 
LKLERITXWTYKYFGS. 



ATTATTTCATTGAACTATATATTCAAAAAAAGATGTGTTTGGAGTGCTTA 


TAATAAAGTAACTTGATATATAAGTTTTTTTCTACACAAACCTCACGAAT 

NYFIELYIQKKMCLECL 
IISLNYIFKKRCVWSA. 
LFH.TIYSKKDV FGVL 

ATACAGTATGACTTCAGTTTGCAAGATTACCTCTTCAGCGTGAGCTTCAG 


TATGTCATACTGAAGTCAAACGTTCT AATGGAGAAGTCGCAGTCGAAGTC 

I QYDFSLQDYLFSVSFS 
YSMTSVCK I TSSASAS 
NTV.LQFARLPLQRQLQ 

CATGCCAAAAAACCATCATCTGCTATGGGGCATGTTTTACACCTTGATGG 


GTACGGTTTTTTGGTAGTAGACGATACCCCGTACAAAATGTGGAACTACC 

MPKNHHLLWGMFYTLM 
ACQKTI ICYGACFTP.W 
HAKKPSSAMGHVLHLDG 
TGCTACATCATCATCATTCATGTTTCATTTTAGGTCTCGTGCTCTTTATA
ACGATGTAGTAGTAGTAAGTACAAAGTAAAATCCAGAGCACGAGAAATAT 

VLHHHHSCF I LGLVLF I 
CYI I I IHVSF.VSCSLY 
ATSSSSMFHFRSRALY 

TAGATCACATAAAAGTTTGGATCGCTTCAAGTTTCTAGGTTACATTGTAT 
ATCTAGTGTATTTTCAAACCTAGCGAAGTTCAAAGATGCAATGTAACATA 

IT.KFGSLQVSRLHCM 
RSHKSLDRFKFLGYIV 
IDHIKVWIASSF.VTLY 

GCAGCACTTTGAGCCTACTGAACATTGTGACTGCCTTTTAGAACATTGGA 
CGTCGTGAAACTCGGATGACTTGTAACACTGACGGAAAATCTTGTAACCT 

QHFEPTEHCDCLLEHW 
CSTLSLLNIVTAF.NIG 
AAL.AY.TL.LPF RTLD 

Pst I

CTGCAGGAA 

3559

GACGTCCTT 

TAG 
L Q E 
C R K 
Sal I

agcgagg'tcgactaatgagctactaacattaatgtcacagatagtaatag 

tCGCtCCAGCTGATTACTCGATGATTGTAATTACAGTGTCTATCATTATC 

SEVD. .ATNINVTDSNR 
ARSTNELLTLMSQIVI 
QRGRLMSY.H.CHR. 

ATGAGAAGCCGTATCCAACACGCAATCTGTANACTTGGTCACAGGACTTC 
TACTCTTCGGCATAGGTTGTGCGTTAGACATNTGAACCAGTGTCCTGAAG 

EAVSNTQSV?LVTGL
DEKPYPTRNL?TWSQDF
MRSRIQHAIC?LGHRTS

TTATCCAAAGACTCGCCTCTGCGATTTCCCACATTCACCTCATTTGGTCC 
AATAGGTTTCTGAGCGGAGACGCTAAAGGGTGT AAGTGGAGTAAACCAGG 

LIQRLASAISHIHLIWS 
LSKDSPLRFPTFTSFGP 
YPKTRLCDFPHSPHLV 

Hind III

ATAGGAAGCTTCACAGCGGGCAGGAATCCATTTCTCTATATAAGCACCAC 


tatccttcgXagtgtcgcccgtccttaggtaaagagatatattcgtggtg 
igsftagrnpflyistt 
easqragihfsi .ap 

HRKLHSGQES I SLYKHH 

CTCCCACCCACACCAGCACCACTACCACTGCTAAGGAGGATGAAGGCCTT 

GAGGGTGGGTGTGGTGGTGGTGATGGTGACGATTCCTCCTACTTCCGGAA 

SHPHHHHYHC.GG.RP 
PPTHTTTTTTAKEDEGL 
LPPTPPPL P LLRRMKA L 

GTTGTTGGTCATCTTTACCCTGGCCTCGTCGCTCGGCGCCTTCGCC GAGC 
CAACAACCAGTAGAAATGGGACCGGAGCAGCGAGCCGCGGAAGCGGCTCG 

CCWSSLPWPRRSAPSPS 
VVGHLYPGLVARRLRRA 
LLV I FTLASSLGAFAE 
AATGCGGAAGGCAAGCCGGGGGGGCTCTCTGCCCCGGCGGGCTGTGCTGT

TTACGCCTTCCGTTCGGCCCCCCCGAGAGACGGGGCCGCCCGACACGACA 

NAEGKPGGLSAPAGCAV 
MRKASRGGSLPRRAVL 
QCGRQAGGALCPGGLCC 

BamH I

agccagta cggctggtgcggtaacacg'gatccatactgcggccaaggatg 

TCGGTCATGCCGACCACGCCATTGtGCCTAGGTAtGACGCCGGTTCCTAC 
ASTAGAVTR I HTAAKD 
PVRLVR. HGS I LRPRM 
SQYGWCGNTDPYCGQGC 

CCAGA GCGAATGCGGCGGTAGCGGCGGTAGCGGCGGTGGCAGCGTGGCCT 


GGTCTCGGTTACGCCGCCATCGCCGCCATCGCCGCCACCGTCGCACCGGA 

aranaavaavaavaawp 
pepmrr.rr.rrwqrgl 
qsqcggsggsgggsva 



cgajcatcagctcctccctcttcgagcagatgctgaagcatcgcaacgac 

GCTAGTAGTCGAGGAGGGA^AAGCtCGTCtACGACTTCGtAGCGTTGCTG 

Rssappsssrc.siatt 
dhqllplradaeasqr 
s i issslfeqmlkhrnd 

gcagc ctgccccggcaagggtttctacacgtacaacgccttcatcgccgc 

CGTCGGACGGGGCCGTTCCCAAAGATGTGCATGTtGCGGAAGTAGCGGCG 

QPAPARVSTRTTPSSP 
RSLPRQGFLHVQRLHRR 
AACPGKGFYTYNAFIAA 

CGCCAACTCCTTCAGCGGGTTCGGGACGACCGGCGACGACCCAAGAAGAA 


GCGGTTGAGGAAGTCGCCCAAGCCCtGCtdGCCGiTGCTGGGTTCTTCTT 

pptpsagsgrpattqee 
RQllqrvrddrrrpkk? 
ansfsgfgttgddprr 
NAAGGAGATCGCGGCTTTCTTGGCGCANACGTCTCACGANACGACAGGTA
NttCCTCTAGCGCCGAAAGAACCGCGTNTGCAGAGTGCTNTGCTGTCCAT 

?GDRGFLGA?VSR?DR. 
KE I AAFLA?TSH?TTG 
?RRSRLSWR?RLT?RQV 

ATTCNCACATCTCCCGAAGCTCGTAAACTGTTTATGGGATANAAA ACTGA 

TAAGNGTGTAGAGGGCTTCGAGCATTTGACAAATACCCTATNTTTTGACT 

F?HLPKL VNCLWD?KL 
NSHISRSS.TVYGI7N. 
l?TSPEARKLFMG?KTE 

ATGTTTGGGGTTTGGCAGGTGGGTNGGCGACGCGCCCGATGGTCCGTACG 

tACAAACCCCAAACCGTCCACCCANCCGCTGCGCGGGCTACCAGGCATGC 

NVWGLAGG7ATRPMVRT 
MFGVWQVG? RRARWSVR 
CLGFGRWVGDAPDGPY 

CCTTGGGTTACTGCTTCGTCCAANAACAAAACCCTCATCGGANTAC TGCG 


^GAACCCAATGACGAAGCAGGTTNTTGTTTTGGGAGTAGCCTNATGACGC 

PWVTASS7NKTL tG?LR 
LGLLLRP?TKPSS?YC 
ALGYCFVQ?QNPHR?TA 
Pst I

TCCCANCTCCCANTGGCCGTGCGCTGCAGCAAAAAATACTACGGCCGAAG 


AGGGTNGAGGGTNACCGGCACGCGACGTCGTTTTTTATGATGCCGGCTTC 

P?S?WPCAAAKNTTAE 
VP?P?GRALQQK 1 LRPK 
P?LP?AVRCSKKYYGRS 

CCCNTCCAAATTTCATNGTNAGCCANATTCTNACAGTTCNTCGCCGCGAT 
GGGNAGGTTTAAAGTANCANTCGGTNTAAGANTGTCAAGNAGCGGCGCTA 

A?PNF?VS? I LTV7RRD 
P?QIS??A?F?QF?AA1 
PSKFH??P?S?SSSPR 

CGAGTTCACAACGATGCCNTTTCTAACGCAACAATCCGATGTGTTNTGCG 
GCTCAAGTGTTGCTACGGNAAAGATTGCGTTGTTAGGCT ACACAANAGGC 

RVHNDA7SNAT 1 RCV?R 
EFTTMPFLTQQSDV7C 
SSSQRC7F RNNPMC7A 

TGGAGCAANTACAANTACGGGCCGGCCGGGAGAGCCATCGGTTCNGAGNT 
ACGTCGTTNATGTTNATGCCCGGCCGGCCCTCTCGGTAGCCAAGNCTGNA 

AA?T?TGRPGEPSV?T
VQQ?Q?RAGRESHRF??
CS?Y?YGPAGRA I GSD?

GNTCAACAACCCAGACCTGGTGGCCACNGACGCGACCATCTCNTTCAAGA 
CNAGTTGTTGGGTCTGGACCACCGGTGNCTGCGCTGGTAGAGNAAGTTCT 

?STTQTWWP?TRPS?SR
?QQPRPGGH?RDHL?QD
?NNPDLVATDAT I SFK

CGGNTCTGTGGTTTTGGATGACTCNTCAGTCGCCGAAGCCGTNGTGCCAC 
GCCNAGACACCAAAACCTACTGAGNAGTCAGCGGGTTCGGCANCACGGTG 

R7CGFG. L7SRPSR7AT 
RSVVLDDSSVAQAVVP 
T7LWFWMT7QSPKP7CH 
GACGTGATAACCGGGAGCTGGACGCCATCCAACGCCGACCAGGCGGCCGG

CTGCACTATTGGCCCTCGACCTGCGGTAGGTTGCGGCTGGTCCGCCGGCC 

T . PGAGRHPTPTRRP 

RRDNRELDAIQRRPGGR 
DV I TGSWTPSNADQAAG 

AAGGCTTCCGGGCTACGGTGTCACCACCAACATCATCAATGGAGGGTTGG 

TTCCGAAGGCCCGATGCCACAGTGGTGGTTGTAGTAGTTACCTCCCAACC 

EGFRATVSPPTSSMEGW 
KASGLRCHHQHHQWRVG 
RLPGYGVTTNI INGGL 



AGTGCGGGAAAGGGTACGATGCCAGGGTGGCGGATAGGATCGGCTTCTAC 
TCACGCCCTTTCCCATGCTACGGTCCCACCGCCTATCCTAGCCGAAGATG 



SAGKGTMPGWR I GSAST 
VRERVRCQGGG DRLL 
ECGKGYDARVADR I GFY 

AAGAGGTACTGCGACTTGCTGGGGGTGAGCTACGGAGACAACTTGGACTG 
TTCTCCATGACGCTGAACGACCCCCACTCGATGCCTCTGTTGAACCTGAC 

RGTATCWG ATETTWT 
QEVLRLAGGELRRQLGL 
KRYCDLLGVSYGDNLDC 



CTACAACCAGAGACCCTTTGCTTCTACAGCAGCTACAGCCACATTCTAGC 
GATGTTGGTCTCTGGGAAACGAAGATGTCGTCGATGTCGGTGTAAGATCG 

ATTRDPLLLQQLQPHS S 
LQPETLCFYSSYSHILA 
YNQRPFASTAATATF 

GGTGAGCTATGGAGAGAACTTGGAGTGCTACAACCAGAGACCCTTTACTT 
CCACTCGATACCTGTGTTGAACCTCACGATGTTGGTCTCTGGGAAATGAA 

GELWRQLGVLQPETIYL 
VSYGDNLECYNQRPFT 
R AMETTWSATTRDPLL 
AGTCCGATACTACTGTGACGAATCCATGTAATAACGCAATAAACGCTATT
TCAGGCTATGATGACACTGCTTAGGTACATTATTGCGTTATTTGCGATAA 

VRYYCDESM. .RNKRY 
SDTTVTNPCNNA I NA I 
SPILL. RIHVITQ.TLL 

ACTGAGATAGCGACTCCGTGAGTTGACTGTAGAAGTTGCGGAGGAAGTCT 
TGACTCTATCGCTGAGGCACTCAACTGACATCTTCAACGCCTCCTTCAGA 

Y.DSDSVS.L.KLRRKS 
TE I ATP VDCRSCGGSL 
LR RLRELTVEVAEEV 



TCAATAAAAGCTTANGTACATACATGGCCCACAACTATCGTTGACCGTGA 
AGTTATTTTCGAATNGATGTATGTACCGGGTGTTGATAGCAACTGGCACT 

S I KA?LHTWPTT I VDRD
Q.KL?YIHGPQLSLTV
FNKSL?TYMAHNYR.P.

TCATATGCATCCATCAAATGTCCTCAAATGTCTTGGAGTAAGTAAATGCG 
AGTATAGGTAGGTAGTTTACAGGAGTTTACAGAACCTCATTCATTTACGC 

HMHPSNVLKCLGVSKC 
IICIHQMSSNVLE.VNA 
SYASIKCPQMSWSK.MR 
TATTCGATCGGTAAAATGAAGATGTTAGAATAAATAAAATTAATTATTTT
ATAAGCTAGCCATTTT ACTTCT ACAATCTTATTT ATTTT AATTAATAAAA 

VFDR.NEDVRINKINYF 
YSIGKMKMLE.IKLIIF 
IRSVK.RC.NK.N.LF 

TTTATAATTATAAAT ATTTT AAT AT ATTTTTTAATCTTAAAGATCCT AAA 

AAATATTAATATTTATAAAATTATATAAAAAATTAGAATTTCTAGGATTT 

Fl I INILIYFLILKILK 

L L. IF.YIF.S.RS. 

FYNYKYFN I FFNLKDPK 

AACCCAATTATAAGGATTTTATATATGGATTGGGATACTAAGAATATTTA 
TTGGGTTAATATTCCTAAAATATATACCTAACCCTATGATTCTTATAAAT 

I .L.GFYIWIGILRIF 
KSNYKDFIYGLGY.EYL 
NL! IRILYMDWDTKNI 

Bgl II

ATTATAAAAATTAATATACTTTTTAATCTTAAAGATCTAATTATAAGTAT 
tAATATTTTtAAtTATATGAAAAATTAGAATTTCTAGATTAATATTCATA 

NYKN.YTF.S.RSNYKY 
IIKINILFNLKDLIISI 
L.KLIYFLILKI.L.V 

TTTCTATATGGATTGGGATATTAACTCGATTTACTTATAAAAATTTTAAT 

AAAGATATACCTAACCCTATAATTGAGCTAAATGAATATTTTTAAAATTA

FLYGLGY.LDLLIKILI 
FYMDWDINSIYL.KF. 
FSIWIGILTRFTYKNFN 

ATAAAAATTTTAAATTTAAAAATTAAAATACTAAAAATATCTAAATATAA
TATTTTTAAAATTTAAATTTTTAATTTTATGATTTTTATAGATTTATATT

KF. I .KLKY.KYLNI 
YKNFKFKN.NTKNI I 

IKILNLKIKILKISKYN 
CGGTAATCATGAGATCGAGAACGTGATGATTGAGATCATGAGATCGAGGT

GCCATTAGTACTCTAGCTCTTGCACTACTAACTCTAGTACTCTAGCTCCA 
TVIMRSRT. .LRS.DRG 
R.S.DRERDD.DHEIEV 
GNHEIENVMIEIMRSR 

TGAGAGTAAAAAGGAAATTACGTTAATCATGGGAAATTTCGTTTTGTTTG 
ACTCTCATTTTTCCTTTAATGCAATTAGTACCCTTTAAAGCAAAACAAAC 

E.KGNYVNHGKFRFVC 
ESKKEITLIMGNFVLF 
LRVKRKLR.SWEISFCL 

CACGGTCGAGATGGTGAGCGTGGACACCTAAGATCCACAACCGGGATGCA 
GTGCCAGCTCT ACCACTGGCACCTGTG6ATTGTAGGTGTTGGCCGTACGT 

TVEMVTVDT HPQPAC 
ARSRW. PWTPN I HNRHA 
HGRDGDRGHLTSTTGMQ 

ATAACCATGTTGTCATATGTTAGCTTGTCTCATATCTTATGACCATGAAT 
TATTGGTACAACAGTATACAATCGAACAGAGTATAGAATACTGGTACTTA 

NNHVVIC.LVSYLMTMN 
ITMLSYVSLSHIL.P.I 
PCCHMLACLISYDH. 

CACATAGTCTTCACGAATATTAATTAAGCCAGCTTAGCATCAGAGTTTTG 
GTGTATCAGAAGTGCTTATAATTAATTCGGTCGAATCGTAGTGTCAAAAC 

HIVFTNIN.ASLASQFC 
T.SSRILIKPA.HHSF 
SHSLHEY.LSQLSITVL 

CACGTTTGTACCATANCTGAAGTGTTCGTATGGCTTGACCCATCCCGAGT 
GTGGAAACATGGTATNGACTTCACAAGCATACCGAACTGGGTAGGGCTCA 

TFVP?LKCSYGLTHPE
APLYH?.SVRMA.PIPS
HLCT I ?EVFVWLDPSRV



FIG. 16C-2 




GTATGGTCTCCCGGANCCTGGAGCGTGTTAACCCGAGG TCTAGTTGAGGG 
CATACCAGAGGGCCtNGGACCTCGCACAATTGGGCTCCAGATCAACTCCC 

CMVSR?LERVNPRSS.G
VWSPG?WSVLTRGLVEG
YGLP?PGAC.PEV.LR

GCATAGACCTTGTTNTCTTAGGCAGAGGTTGAAGATCACTC CTTTAGCTA 
CGTAtCTGGAACAANAGAATCCGTCTCCAACTTCTAGTGAGGAAATCGAT 

A TL7S.AEVEDHSFSY 

HRPC7LRQRLKITPLA 

GIDLV7LGRG.RSLL L 

TCCGTTGGGTGCCTATATAAAGGTCGAAATCATGAGGGGGATTC NTAACT 
AGGCAACCCACGGAtATAttTCCAGCTTTAGTACTCCCCCTAAGNATTGA 

PLGAYIKVEIMRG17N 
IRWVPI.RSKS.GGF7T 
SVGCLYKGRNHEGDS L 

CGACCTATTCAATATTTGAGCTAGCAAGAGTTGGAGTTAC GTGTATGAGG 

GCTGGATAAGtTATAAACTCGATCGTTCTCAACCTCAATGCACATACTCC 

STYS I FELARVGVTCMR 
RPIQYLS.QELELRV.G 
DLFNI .ASKSWSYVYE 

TTCGAGGCCCAATGCTGTTCCTGGGGTCGTTTTATA CCTATTCCTGCATC 

AAGCTGGGGGtTACGACAAGGACCCCAGCAAAATATGGATAAGGACGTA6 

FDPQCCSWGRFYTYSCM 

STPNAVPGVAF 1 P I PA 
VRPPMLFLGSLLYLFLH 

GTGATCATACATAGTAGCTTTAATCATCTTCAGTCA TCATCGTACGTTGG 

CACTAGTAt6tATCAtCGAAAttAGtAGAAGTCAGTAGTAGCATGCAACC 

SYIVALI IFSHHRTL 
C DHT..L.SSSVIIVRW 
VI IHSSFNHLQSSSYVG 
GTGCATGCATTGTCTAATTTACTCGATTCAATNTCGTTCGACACTGCTTC

CACGTACGTAACAGATTAAATGAGCTAAGTTANAGCAAGCTGTGACGAAG 
GACIV.FTRFN?VRHCF
VHALSNLLDS?SFDTAS
CMHCL I YS I Q?RSTLL

Xho I

ctac ctactatgtggcccaatacatagttgtattgtctcatacggcc'tcg 


gatggatgatacaccgggttatgtatcaacataacagagtatgccggagc 
lptmwpnt.lyclirpr 
yllcgpihscivsygl 

PTYYVAQY I VVLSHTAS 

agcaaa gcgtgtgcagaggaactgtgtcaagtggttggctggcctcgggc 
tcgtttcgcacacgtctccttgacacagttcaccaaccgaccggagcccg 

AKRVQRNCVKWLAGLG 

eqsvcrgtvs sgwlasg 
skacaeelcqvvgwpra 

TCATGGCATTGAGTTGGCTCGATACAACACATCGGCTTAGGGATACCATG 

agtaccgtaactcaaccgagctatgttgtgtagccgaaTccctatggtac 

lmalswldtthrlrdtm 
swh.vgsiqhiglgipc 
hgielaryntsa.gyh 

ccgagtctattgtggtagttgacatgtcatgtggggtggatgccaaaata 


ggctcagttaacaccatcaactgtacagtacaccccaxctacggttttat 
psllw.ltchvgwmpky 
rvycgs hvmwggcqn 
aes i vvvdmscgvdak i 

tgctatatcattctctccctacaaaggagttgtgccataggagaatggtg 

ACGATATAGTAAGAGAGGGATGTTTCCTCAACACGGTATCCTCTTACCAC

aisfspykgvvp.enr 
mlyhslptkelchrriv 
CYI ilslqrscaigesw 
GACACGGCTTGGGTTCTGTGGTCGGTCCTTGTTCGCCTCAGTTGGGTGGA
CTGTGCCGAACCCAAGACACCAGCCAGGAACAAGCGGAGTCAACCCACCT 

6HGLGSVVGPCSPQLGG 
DTAWVLWSVLVRLSWVD 
TRLGFCGRSLFASVGW 

TTACTTCATCAAGTTGGCCNTCTGTTGGCTGGGCAAAGTACACTTGGTAG 


AATGAAGTAGTTCAACCGGNAGACAACCGACCCGTTTCATGTGAACCATC 

LLHQVG7LLAGQSTLGR 
YF I KLA7CWLGKVHLV 
ITSSSWPSVGWAKYTW. 

GGATGGTCGAGACAAGNCCAAGGAAGGTTGGCTAAGACTTGGTTTTCGAC 


CCTACCAGCTCTGTTCNGGTTCCTTCCAACCGATTCTGAACCAAAAGCTG 

DGRDK?KEGWLRLGFR
GMVET?PRKVG DLVFD
GWSRQ?QGRLAKTWFST

AATCAATTGTTTATGAGGCGAATGGTATCCCTCCGTTGGGGTGTCTGCTC 


TTAGTTAACAAATACTCCGCTTACCATAGGGAGGCAACCCCACAGACGAG 
QS I VYEANGI PPLGCLL 
NQLFMRRMVSLRWGVCS 
INCL.GEWYPSVGVSA 

GTTTCGATTTGTTGCGATGGATTGTTTGTTGTAGGAGGCTTGGTTCGATT 

CAAAGCTAAACAACGCTACCTAACAAACAACATCCTGCGAACCAAGCTAA 

VS I CCDGLFVVGGLVRL 

FRFVAMDCLL EAWFD 

RFDL LRWI VC'CRRLGS I 

GCTCTTAAGTCGGGAGAAGGTATTTGNTAAGGAGTTCAATTTGACCATGT 
CGAGAATTCAXCCCTCTTCCATAAACNATTCCTCAAGTTAAACTGGTACA 

LLSREKVF7KEFNLTM 

CS .VGRRYL7RSSI.PC 

ALKSGEGI7.GVQFDHV 
ACTTCACTTATTTTCCTGAACGGTTCTTCAAACCGAGCTGGCACAATTTC
LK. IKGLAKKFGSTVLK 
SE.KDLPRSLARPC.S 
EVNKRTCQEVWLDRVK 

CCAGAGAATGTGTATGTCGAGGTCTATTCAACCATGTGGAAGCTAGAGAA 

GGTCTCTTACACATACAGCTCCAGATAAGTTGGTATACCTTCGATCTCTT 

PENVYVEVYSTMWKLEN 
QRMCMSRS I QPCGS R 
A RECVCRGLFNHVEARE 

TGCACCAATTGTGAGGTTTGGCTTGCTCACGTTTAAAGCAGAAGGATATA 
ACGTGGTTAACACTCCAAACCGAACGATTGCAAATTTCGTCTTCCTATAT 

AP I VRFGLLTFKAEGY 

MHQL.GLACSRLKQKDI 

CTNCEVWLAHV.SRRIY 

CTTGCTACGAGGTTTGCTCAACCATGTGGAAGCAATCAAATGCACTTGCT 
GAACGATGCTCCAAACGAGTTGGTACACCTTCGTTAGTTTACGTGAACGA 

TCYEVCSTMWKQSNALA 

LATRFAQPCGSNQMHL L 

LLRGLLNHVEAIKCTC 
ATGAGGTTTGGCTTGACTTACTCGACAATGGACGCTNGTAAGTGAGAAGG

TACTCCAAACCGAACTGAATGAGCTGTTACCTGCGANCATTCACTCTTCC 
MRFGLTYSTMDA7K.EG 
GLA LTRQWTLVSEK 
YEVWLDLLDNGR? VRR 

Spe I

GACTANCCAAGACTTAGTTGGCAAGGACTAGTCGATACTTGCTCGACAAT 


CTGATNGGTTCTGAATCAACCGTTCCTGATCAGCTATGAACGAGCTGTTA 

T7QDLVGKD S I LARQ 

GL7KT LARTSRYLLDN 
D7PRLSWQGLVDTCST I 

Sal I

agatgcctataggtaatggattgactgagacttag'tcgacaaagactagc 


tctacggatatccattacctaactgactctgaatcagctgtttctgatcg 
mpigngltet.stkts 
rcl vmd. lrlsrqrla 

DAYR .Wl D DLVDKD. 



Xho I

tgagacttagtgggcaatggatgcctataagtaagaaaggatggc'tcgag 


actctgaatcacccgttacctacggatgttcattctttcctaccgagctc 
dlvgngcl vrkdgsr 
et.wamdayk.ermar 
lrlsgqwmp ! skkgwle 

attaataaagatcaaataattaatataaatttatcaaacacttaatggac 

TAATTATTTCTAGTTTATTAATTATATTTAAATAGTTTGTGAATTACCTG 

LIKIK.LI lYQTLNG 
D..RSNN.YKF1KHLMD 
INKDQI ININLSNT.WT 

GCATATAAGTGAGAAAGGACGGATCGAGATTAATAAAGATCAAATAATTA 
CGTATATTCACTCTTTCCTGCCTAGCTCTAATTATTTCTAGTTTATTAAT 

RI.VRKDGSRLIKIK.L 
AYK.ERTDRD..RSNN. 
HISEKGRIE INKDQI I 
ATATAAGTTTATCAAACNCTTATTAANACATTGGACAAAAGAGGTACTAT
TATATTCAAATAGTTTGNGAATAATTNTGTAACCTGTTTTCTCCATGATA
I VYQTLI7TLDKRGTM 
YKF I K?LL?HWTKEVL 
NISLSN?Y.?IGQKRYY 

GTAATATTAAAATTGGGAGGCACAAATATTATTTCCAAATACTTTTCTCC 


CATTATAATTTTAACCCTCCGTGTTTATAATAAAGGTTtATGAAAAGAGG 

Y.NWEAQILFPNTFL 
CNIKIGRHKYYFQILFS 
VILKLGGTNIISKYFSP 

TTAAGCCCTTCGCCACCATTGCCATTTTAATCTATTTTTTCTATATAATT

I I I I I I I I I I I I I I I I I I I I I I I I 1 i I I I I I I I I I I I I I I I I I I I I I I ( ,| 

AATTCGGGAAGCGGTGGTAACGGTAAAATTAGATAAAAAAGAtAtAtTAA 
LKPFATIAILIYFFYII 
LSPSPPLPF.SIFSI.L 
ALRHHCHFNLFFLYN 

ATCNCATAACATTCGTACATGAGATATGACATAAACCTTCGACGTGCTTT 

I ' I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

TAGNGTATTGTAAGCATGTACTCTATACTGTATTTGGAAGCTGGACGAAA 
l?.HSYMRYDINLRPAL 
SHNIRT.DMT.TFDLL 
Y7ITFVHEI.HKPSTCF 

AGTAAACATNTTGATTATNGTGACACCAGAAGCCATAATATTGCTTACCT 

J. ' i ' I '' ' ' I ' ' ' I I ' I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

TCATTTGTANAACTAATANCACTGTGGTCTTCGGTATtAtAACGAATGGA 
VN?LI?VTPEAIILLT 
T?.L?.HQKP.YCLP 
SKH?DY?DTRSHNIAYL 

TAACATGATGGAGATGAACTTTAGTTGGTCCAANTATCTAATNAATGGAA 
I ' I I I I ' I I I I I ' I I ' I I I I I I I I I I I I I I I I I I I ' I ' I I I ' I I I I I I I I 
ATTGTACTACCTCTACTTGAAATCAACCAGGTTNATAGATTANTTACCTT 

LT.WR.TLVGP?I.?ME 

HDGDEL LVQ?SN?WK 

NMMEMNFSWS?YL?NG 
GTGGACAAGCACGATGACTAGGATGGCTACATGTTCATGTGTTGACTTTC
CACCTGTTCGTGCTACTGATCCTACCGATGTACAAGTACACAACTGAAAG

VDKHDD.DGYMFMC.LS 
WTSTMTRMATCSCVDF 
SG QAR LGWLHVHVLTF 

CAAGTAATCAATCAAGCTGGAATCGAATAAGACGATTAAAGTAGGGCGAT 

I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I r I I I I I I I 
GTTCATTAGTTAGTTCGACCTTAGCTTATTCTGCTAATTTCATCCCGCTA 

K.SIKLESNKTIKVGR 
PSNQSSWNRIRRLK.GD 
QVINQAGIE.DD.SRAM 

GACCATTAAGTTCAATGTCACGCTCATCAACATAATTCCAACACCGTGCA 
I I ) I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
CTGGTAATTCAAGTTACAGTGCGAGTAGTTGTATTAAGGTTGTGGCACGT 

PLSSMSRSST FQHRA 
DH VQCHAHQHNSNTVQ 
TIKFNVTLINIIPTPC 

BglII

GAAAGATCTTATCTTACATTGACTTGCCCATCCGGCCGCCGGCATCGATT 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ' I I I I I I ' I 
CTTTCTAGAATAGAATGTAACTGAACGGGTAGGCCGGCGGCCGT AGCTAA 

ERSYLTLTCPSGRRHRL 

KDL I LH. LAHPAAGID 

RKILSYIDLPIRPP ASI 
EcoR I

GGCGGAAACGAAGGGTCAGTCTCCCAATTCACATTCAAAGGACGAATTCA 
CCGCCTTTGCTTCCCAGTCAGAGGGTTAAGTGTAAGTTTCCTGCTTAAGT

AETKGQSPNSHSKDEF 
WRKRRVSLPIHIQRTNS 
GGNEGSVSQFTFKGRiH 

TTTTCATCAGATGAGCACTTCAGTCCTGCTTGATTATATTTTATTATTAT 

I I I I I I I t I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

AAAAGTAGTCTACTCGTGAAGTCAGGACGAACTAAT ATAAAATAATAATA 
IFIR.ALQSCLI IFYYY 
FSSDEHFSPA.LYFIII 
FHQMSTSVLLDY I LLL 

TATTATTATTAATTGAATGGTAAGTTTACAGAATATATAGATATTTTAGT 
I I I I I I 'I I I I I ' I I I I I ' I ' I ' I I I I I I I ' I I I I I I I I I I ' I I I I I I I I 
ATAATAATAATTAACTTACCATTCAAATGTCTTATATATCTATAAAATCA 

YYY.LNG KFTEYIDILV 

I I IN.MVSLQNI IF. 

LLLLIEW.VYRIYRYFS 

TTCAATAAAATATTTTAAAAAATGATAAAGGGAGAAGGTGGATTTGATCT 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

AAGTTATTTTATAAAATTTTTTACTATTTCCCTCTTCCACCTAAACTAGA 

SIKYFKK. .REKVDLI 
FQ. N I LKNDKGRRWI S 
FNKIF.KMIKGEGGFDL 

TAGGATTTTTATTGTGAGCAATAAAAGTCTTTAGTTAGAACTTCCAAAAT 

I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

ATCCTAAAAATAACACTCGTTATTTTCAGAAATCAATCTTGAAGGTTTTA 
LGFLL.AIKVFS.NFQN 
DFYCEQ. KSLVRTSKM 
RIFIVSNKSL.LELPK 

GTGTCAAATGAACCCTAATAAGTGGGTTTGGTCTATGGTTACGATGAGAT 

I I I I I I I I t I I I I I i I I I I I I I I I I I I I I I I I [ I I I I I I I I I I I I I I I I I 

CACAGTTTACTTGGGATTATTCACCCAAACCAGATACCAATGCTACTCTA 
VSNEP VGLVYGYDE I 

CQMNPNKWVWSMVTMR 
CVK.TLISGFGLWLR.D 
CAGTATTTGTATATAAAAAAATTATCAACTTGATTTTTATTTTTTAACCC
Jl-lj>i-liiij>l-liJ.'' '' ' 'l '''' ' '' i|i i ii|i i ii|i i ii|iiii | 
GTCATAAACATATATTTTTTTAATAGTTGAACTAAAAAtAAAAAAttGGG 

SICI.KNYQLDFYFLT 

SVFVYKKIINLIFIF.P 

QYLYIKKLST.FLFFNP 

TTAATAAGTGGACATGATATATCATAATCAAATGATGTGATGTNTGATGA

AATTATTCACCTGTACTATATAGTATTAGTTTAGTACACtACANACTACt 
LNKWT.YI I IKSCDV. 
LISGHDIS.SNHVM7DE 
VDMIYHNQIM .C7M 

GTNATAACATATTTTTTAATAATNAAAATTATNAATAGAGAAAAAATAAG 

I I ' I I ' I I I I I I I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I 

CANTATTGTATAAAAAATTATTANTTTTAATANTT ATCTCttttttATTC 
VITYFLI?KI?NREKIR 
? .HIF..?KL?IEKK. 
S?NIFFNN?KY?.RKNK 

ATTACTATCCCTTCTATNGATGTNTTATAATATTTTAATCCCTTTCNATA 
i«i-l-li.!. ' JLlI '' 'l ''' ' l' ' ''l ' ' ' ' | ii ' i |ii ii|i i ii|i i ii | 
TAATGATAGGGAAGATANCTACANAATATTATAAAATTAGGGAAAGNTAT 

LLSLL?M?YN I L I PF'' 

DYYPFY?C?I1F.SLSI 

ITIPS?DVL.YFNPF?Y 

TAGATTCACGTAGAATAAGAAAGATTATAATCGCATCAAATCAAATACAG 
I I I ' I I ' I I I I I I I I I I I I I I I I I I ' I I I I I I I I I I I I I I I I i I I I I I I I 
ATCTAAGTGCATCTTATTCTTTCTAATATTAGCGTAGTTTAGTTTATGTC 

IDSRRIRKI I lASNQIQ 

IHVE.ERL.SHQIKYR 

RFT.NKKDYNRIKSNT 

AATNAAATCATGCTTTTGACTTAATTCGAAAAATAATCTTCCTCTCTTGA 
I I I I I ' I I ' I I ' I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I 
TTANTTTAGTACGAAAACTGAATTAAGCTTTTTATTAGAAGGAGAGAACT 

N7IMLLT.FEK.SSSLD 
?KSCF LNSKNNLPLL 
E7NHAFDLIRKIIFLS. 
TAATATCCTTATTGATAAGCATTNTTATATATATATATATNTATATCAAC

I I I ' I ' I I I ' I ' I I I I I I I I ■ I I ■ I i' I I t r I I I I I I I I I I I I I I 1 I I 

ATTATAGGAATAACTATTCGTAANAATATATATATATATANATATAGTTG

NILIDKH?YIYIY'7YQ 
IISLLISI?IYIY?YIN 
YPY. .A?LYIYI?IST 

TTCTAAAANATATTTTTAAATTAATTAAATTTATCAAAATAAAAAGATAA

1 j '' I '' ' J. ' i ' 1 ' I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I I 
AAGATTTTNTATAAAAATTTAATTAATTTAAATAGTTTTATTTTTCTATT

LLK7IFKLIKFIKIKR. 
F -?IFLN.LNLSK.KDK 
SK?YF. IN. lYQNKKI 

ACTAAATTAGTTCTGCATCATAATGTAGTAAGTGTAAGAACTTGTGAAAT

I I I ' I I I I ' I I I I I I I I I I I I I I I i I I I I I I I I I I I I r I I I I I I I I I I I I 
TGATTTAATCAAGACGTAGTATTACATCATTCACAttcttGAACACtttA 

TKLVLHHNVVSVRTCE I 
LN.FCIIM..V.ELVK 
N. ISSAS.CSKCKNL.N 

Xba I Spe I

anggatctagaacactgatagaaaattccaaaccatta'ctagttctactt 

I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I 
TNCCTAGATCTTGTGACTATCTTTTAAGGtTTGGtAATGAtCAAGAtGAA 

? I NTDRKFQT I TSST 

?GSRTLIENSKPLLVLL 
?DLEH. .KIPNHY.FYL 
GATGAAAACAAAACCATATAAAAGAATCCTCTTATATATATATATATATA
CTACTTTTGTTTTGGTATATTTTCTTAGGAGAATATATATATATATATAT

KQNHIKESSYIYIYi 
DENKTI .KNPLIYIYIY 
MKTKPYKRILLYIYIY 

TATACTACTTTACTTATTCTTTGGACGTACAACACAAGTCAGGAAACCGA 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
ATATGATGAAATGAATAAGAAACCTGCATGTTGTGTTCAGTCCTTTGGCT 

YTTLL I LWTYNTSQETE 
I LLYLFFGRTTQVRKP 
lYYFTYSLDVQHKSGNR 

AACAAAGGTGGCGGAAAGTTGGCAGANGCTGAAGAGACTTTTCGTAGAAG 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
TTGTTTCCACCGCCTTTCAACCGTCTNCGACTTCTCTGAAAAGCATCTTC 

TKVAESWQ7LKRLFVE 

KQRWRKVGR? RDFS K 

NKGGGKLA7AEETFRRS 



TGAAGGAGACACACGTCTATAAGAATTGTCATGACTATACGCTGAAGAAA 
I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I ' I ' I I I I I M I I 
ACTTCCTGTGTGTGCAGATATTCTTAACAGTACTGATATGCGACTTCTTT 

VKETHVYKNCHDYTLKK 
RRHTSIRIVMTIR.RK 
EGDTRL.ELS.LYAEE 

AAGAGGGGAGAGAGAGAGAAGGAAGCGCCACTGTTGACCGGTCTTGTCCA 
I I I I I I I I I I I I I I I I I I I I I I I I I I I ' I I I I I I ' I ' I I ■ I ' I I I ' I I 
TTCTCCCCTGTCTCTCTCTTCCTTCGCGGTGACAACTGGCCAGAACAGGT 

KRGEREKEAPLLTGLVH 
RGERERRKRHC PVLS 
KEGREREGSATVDRSCP 

Sal I Sal I

TGAGGAATTGTTTGTCGACTAATGAGCAGTACAAACATTTGTGTCGACAG 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ' I I 
ACTCCTTAACAAACAGCTGATTACTCGTCATGTTTGTAAACACAGCTGTC

EELFVD. .AVQTFVST 

MRNCLSTNEQYKHLCRQ 

GIVCRLMSSTNICVDR 
ATGGCAACAAATGAGAAGCGGTATCCCAACACGCAATCTGTAGCCTTTGG
TACCGTTGTTTACTCTTCGCCATAGGGTTGTGCGttAGACATCGGAAACC
DGNK EAVSQHA I CSLW 

MATNEKRYPNTQSVAFG 
WQQMRSG I PTRNL PL 

TCNCCAGACTTATCCAAAGACTTGCCTCTGCGATTTCCTCATGCGCCTCA 

M ' I M I' I I I I I I I I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
AGNGGTCTGAATAGGTTTCTGAACGGAGACGCTAAAGGAGTACGCGGAGT 

SPDLSKDLPLRFPHAPH 
?QTYPKTCLCDFLMRL 
V?RL iQRLASA I SSCAS 

HindIII

TCTGTTCCAAAGGAAGCTTCACAGCGGGCAGGAATCCATTTCTCTATATA 
I I ' M I I ' I I I I I I I I I I I I I I I I I ' I ' I I I I I I I I I I I I I I I I I I I I I I 
AGACAAGGTTTCCTTCGAAGTGTCGCCCGTCCTTAGGtAAAGAGAtAtAt 

LFQRKLHSGQES I SLY 
ICSKGSFTAGRNPFLYI 
SVPKEASQRAGIHFSI 

AGCACCACCTCCCACCCACACCACCACCACCACCACCACTGCTAAGGAGG 

i l-i-ll-lJll I I ' I I I ' I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I ' I ' I I 

TCGTGGTGGAGGGTGGGTGTGGTGGTGGTGGTGGTGGTGACGATTCCTCC 
KHHLPPTPPPPPPLLRR 
STTSHPHHHHHHHC.GG 
APPPTHTTTTTTTAKE 

ATGAAGGCCTTGTTGCTGGTCATTTTTACGCTGGCCTCGTCGCTCGGCGC 

I I I M I 1 ' I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I 
TACTTCCGGTACAACGACCAGT AAAAATGGGACCGGAGCAGCGAGCCGCG 

MKALLLV I FTLASSLGA 

RPCCWSFLPWPRRSA 

DEGLVAGHFYPGLVARR 

CTTCGCCGAGCAATGCGGAAGGCAAGCCGGGGGGGCTCTCTGCCCCGGCG 

I I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

GAAGCGGCTCGTTACGCCTTCCGTTCGGCCCCCCCGAGAGACGGGGCCGC 

FAEQCGRQAGGALCPG 
PSPSNAEGKPGGLSAPA 
LRRAMRKASRGGSLPRR 
CCGACACGACATCGGTCATGCCGACCACGCCATTGTGCCTAGGTANGACG
GLCCSQYGWCGNTDP7C 
GCAVASTAGAVTR I H?A 
AVL PVR LVR, HGS?L 

GGTCAAGGATGCCANANCCAATGCNCANGCTCCACGCGCTCCCCTTCCAC 

I I I I I I I r I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
CCAGTTCCTACGGTNTNGGTTACGNGTNCGAGGTGCGGGAGGGGAAGGTG 

GQGC??QC??STPSPST 
VKDA??NA?APRPPLP 
RSRMP?PM??LHALPFH 

TCCGAGCGGCGGTGGCANNGTTGGCTCGATCATCATCTCCTCCCTCTTCN 

I I I I I I — I I I I I I t I — I I I I I I I I I I I I I I I I I I I — 1 — I — I — I— I—' — I I I I I I I I I 

AGGCTCGCCGCCACCGTNNCAACCGAGCTAGTAGTAGAGGAGGGAGAAGN 

PSGGG7VGSIIISSLF 
LRAAVA7LARSSSPPSS 
SERRW??WLDHHLLPL? 

AGCAGATGCTGAAGCATCNCANCGACNCAGCCNGCCCCGGCAANGGCTTC 

I I I I I I I I I — I — I I I — I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

TCGTCTACGACTTCGTAGNGTTGCTGNGTCGGNCGGGGCCGTTNCCGAAG 
?QMLKH??D?A?PG?GF 
SRC S I ??TQPAPA?AS 

ADA EAS?R?S?PRQ?L 
TACNCGTNCACCGCCTTCATCTCCGCCGCCANCTCCTTCANCGGGTTCGG
I I I I I I I I I I I I I I I ' I I I I I I I I I I I I ' I I ' I ' I I I I I I I I I I I I I I I I 
ATGNGCANGTGGCGGAAGTAGAGGCGGCGGTNGAGGAAGTNGCCCAAGCC 

Y??TAF i SAA?SF?GFG 
TR?PPSSPPP?PS?GS 
L?VHRLHLRR?LL?RVR 

GACNACCNGCGACCACTCCACNAATAANANGGANATCNCGGCTTTCTTGG 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I r I I I I I I I 

CTGNTGGNCGCTGGTGAGGTGNTTATTNTNCCTNTAGNGCCGAAAGAACC 

TT7DHSTN??? I ?AFL 
G7PATTP? 1 ???SRLSW 
D??RPLH? ?G??GFLG 

TNCNGACNTCTCNCGAGACNACANGTAATGCNTNCNTCTCCCGAGGCTCG 
I I I I I I I I I I I ' I I I I I I I I I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I 
ANGNCTGNAGAGNGCTCTGNTGTNCATTAGGNANGNAGAGGGCTCCGAGC 

V?TS?ETT?NP??SRGS 
???L?R??VI??SPEAR 
?D?SRD?? S??LPRL 

TCTNCAGNTTATNGATAGACANCTNAATGCATTGGGTTNGGCACGTGGGT 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

AGANGTCNAAT ANCTATCTGTNGANTTACGTAACCCAANCCGTGCACCCA 
S??Y? T??C I G7GTWV 

LQ??DR?LNALG?ARG 
V?? L? I D??MHWV?HVG 

GGTCCACCGTGCCCNATGGCCNTTCGCGTGGGGTTAGTGCTTCGTCCAGN 
I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
CCAGGTGGCACGGGNTACCGGNAAGCGCACCCCAATGACGAAGCAGGTCN 

VHRA7WPFAWGYCFVQ 
WSTVP?G?SRGVTASS? 
GPPCPMA7RVGLLLRP? 

AACAGAACCCTCATCGGACTACTGCGTCGCCAGCTCGCANTGGCCGTGCG 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

TTGTCTTGGGAGTAGCCTGATGACGCAGCGGTCGAGCGTNACCGGCACGC 

?QNPHRTTASPAR?GRA 
NRTL IGLLRRQLA7AVR 
TEPSSDYCVASS7WPC 
CTGCANGCAANAAATACTACGGCCGAAGCCCCATCCAAATCTCATTCAAC

I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

GACGTNCGTTNTTTATGATGCCGGCTTCGGGGTAGGTTTAGAGTAAGTTG 
L?A?NTTAEAPSKSHST 
C?Q? I LRPKPHPNL IQ 
AA??KYYGRSPIQISFN 

TACAACTACGGGCCGGCCGGGAAAACCATCGGCTCCGACCTGCTCAACAA 

I I I I I I ' I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
ATGTTGATGCCCGGCCGGCCCTTTTGGTAGCCGAGGCTGGACGAGTTGTT 

TTTGRPGKPSAPTCST 
LQLRAGRENHRLRPAQQ 
YNYGPAGKT IGSDLLNN 

CGCAGACCTGGTGGCCACCGACCCGACCATCTCCTTCAAGACGGCTCTGT 

M I I I I I I ' I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
GGGTCTGGACCACCGGTGGCTGGGCTGGTAGAGGAAGTTCTGCCGAGACA 

TQTWWPPTRPSPSRRLC 
PRPGGHRPDHL LQDGSV 
PDLVATDPTISFKTAL 

GGTTCTGGATGACTCCTCAGTCGCCCAAGCCGTCGTGCCACGACGTGATA 
I 'I I I I ' I ' I I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
CCAAGACCTACTGAGGAGTCAGCGGGTTCGGCAGCACGGTGCTGCACTAT 

GSG.LLSRPSRRATT.. 

VLDDSSVAQAVVPRRD 

WFWMTPQSPKPSCHDV I 



ACCGGGAGCTGGACGCCATCCAACGCCGACCGGGCGGCCGGAAGGGTTCC 
I I ' I I I I I I I I I I I I ' I ' I I ' I' I I I I I 1 I I ' I I I I I I I I I I I I I I I I I I 
TGGCCCTCGACCTGCGGT AGGTTGCGGCTGGCCCGCCGGCCTTCCGAAGG 

PGAGRHPTPTGRPEGF 

NRELDAIQRRPGGRKAS 

TGSWTPSNADRAAGRLP 



GGGCTACGGTGTCACCACCAACATCATCAATGGAGGGTTGGAGTGCGGGA 
I I I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
CCCGATGCCACAGTGGTGGTTGTAGTAGTTACCTCCCAACCTCACGCCCT 

RATVSPPTSSMEGWSAG 

GLRCHHQHHQWRVGVRE 

GYGVT TNI INGGLECG 
AAGGGTCCGATGCCAGGGTGGCGGATAGGATCGGCTTCTACAANAGGTAC
I I I I I I I I I I I ' I I I I I I I I I I ' I I ' I I I I I I I I I I I I I I I I I I I I I I I I 
TTCCCAGGCTACGGTCCCACCGCCTATCCTAGGCGAAGATGTTNTCCATG 

KGPMPGWR1GSAST7GT 
RVRCQGGG DRLLQ7V 
KGSDARVADRiGFY?RY 

TGCGACTTGCTGGGGGTGAGCTACGGAGACAACTTGGACTGCTACAACCA 
I I I I I I I I I I I I ' I I ' I I I I I I ■ I I I I I I I 1 I I I I I I 1 I I I I I I I I I I I I 
ACGCTGAACGACCCCCACTCGATGCCTCTGTTGAACCTGACGATGTTGGT 

ATCWG ATETTWTATT 
LRLAGGELRRQLGLLQP 
CDLLGVSYGDNLDCYN? 

NAGTCCCTTTACTTANTCCGATACTATGTGCGAATCCATGTAATAACGCA 

I I I I I I I I I I I I I I I I ' I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I i I I I i 
NTCAGGGAAATGAATNAGGCTATGATACACGCTTAGGT ACATTATTGCGT 

?VPLL?RILCANPCNNA 
?SLYL?RYYVRIHVITQ 
SPFT*SDTMCESM. .R 

ATAAACGCTACTGCTGAAATAGCGACTCCGTGAGTTGATTGTAGAAGTTG 
I I I I I I I I I I I I I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
TATTTGCGATGACGACTTTATCGCTGAGGCACTCAACTAACATCTTCAAC 

INATAEIATP.VDCRSC 
TLLLK.RLRELIVEV 
NKRYC.NSDSVS.L.KL 

POLY A

cggaggaaatcttcaataaaagctaagctgaac'aagttcatggccctcaa 

I I I I I I i I I I I I I I I I ' I I I ' I I I I ' I I ' I I I ' I I I I ' I I I I ' I ' I I 

gcctcctttagaagttattttcgattcgacttgttcaagtaccgggagtt 

ggnlq. kls tsswps 
aeeifnks.aeqvhgpq 
rrkssikaklnkfmaln 

tcatcgttgatcgtcgtcagatgcatccatcaaatgtcttggagtnagtn 

I I I I I I I 1 I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

agtagcaactagcagcagtctacgtaggtagtttacagaacctcantcan 
i ivdrrqmhpsnvle7v 
ssl i vvrc i hqmsws?? 
hr. sssdas i kclgvs 
AATGCGTTTTCNATCGGTAAATTGAAGATGTTAGAATAAATAAAATTATT

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I [ I I I I I I I I I I I I I I I I I I I 

TTACGCAAAAGNTAGCCATTTAACTTCTACAATCTTATTTATTTTAATAA 

NA7SIGKLKMLE. IKLF 
MR??SVN .RC.NK.NY 
?CVF?R. lEDVRINKI I 

TATTTTTTATAATTATAAATATTTTAATATATTTTTTAATCTTAAAGATC 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

ATAAAAAATATTAATATTTATAAAATTATATAAAAAATTAGAATTTCTAG 

IFYNYKYFNIFFNLKD 
LFFI I INILIYFLILKI 
YFL.L. IF.YIF.S.RS 

CTAAAAAATCTNATTATAAGGATTTTATATATGGATTGGGATACTAANAA 

I I I I I I I I I I I I I I I I I I I I r I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
GATTTTTTAGANTAATATTCCTAAAATATATACCTAACCCTATGATTNTT 

PKKS7YKDFIYGLGY.? 
LKNLIIRILYMDWDT7K 
KI7L.GFYIWIGIL? 

BamH I

aanttnattatnaaaattaatatacttttaatcttaag'gatcctaaaaaa 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I 

ttnaantaatanttttaattat atgaaaattagaattcctaggatttttt 

77i7kinillilrilkk 
77l7kliyf.s.gs.k 
k77y7n ytfnlkdpkk 

acataattataaggattttctatatggatngggatactaacaanatntaa 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I 

tgtattaatattcctaaaagatatacctanccctatgattgttntanatt 

hnykdflyg7gy.q77 
niiirifymd7dtn77. 
t.l.gfsiw7gilt77 n 

ttgtaaaaatttnaatataaaattgttaaatctaaaaattaaaatactaa 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ' I I I I I I ■ I- I 

aacatttttaaanttatattttaacaatttagatttttaattttatgatt 

ivki7i .nc. i .klky. 
l.kf7ykivkskn.ntk 
ckn7nikllnlkikil 
XhoI

EcoRV BglII

aaatatatantaatcatgat'atcgagaatgtggcgcttagatctcgagat 

I I I I I I I I ' I I I ' I I I I ' I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
TTTATATATNATTAGTACTATAGCTCTTACACCGCGAATCTAGAGCTCTA 

KYI7IMISRMWRLDLEI 
NI7.S.YRECGA ISR 
KIY7NHDIENVALRSRD 

CGAGGTTGAGACTANAGNGGAAATTATGTTAATCATGGGAAATTTTCTTT 
I I I I I I I I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I ' I I 
GCTCCAACTCTGATNTCNCCTTTAATACAATTAGTACCCTTTAAAAGAAA 

EVET??EIMLIMGNFL 
SRLRL??KLC.SWEIFF 
RG.D??GNYVNHGKFSF 

TGTTTCCAAGACGATGACCGTGGAAACCTAACATCCGCAATCGGTCATGC 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

ACAAAGGTTCTGCTACTGGCACCTTTGGATTGTAGGCGTT AGCCAGTACG 
LFPRR PWKPN I RNRSC 

CFQDDDRGNLTSA I GHA 
VSKTMTVET HPQSVM 

AATAACCATGTTATCATCANTGAACTTGTCGTCGTCATCTTACGGCCACA 
I i I I I I I I I I I I I I I I I I I I I I I I [ I I I I I I I I I I I I I I I I I I I I I I ' I I 
TTATTGGTACAAT AGTAGTNACTTGAACAGCAGCAGTAGAATGCCGGTGT 

NNHVI I7ELVVVILRPQ 
I TMLSS7NLSSSSYGH 
Q. PCYH? TCRRHLTAT 

AATCACAGTCTTCTANCAAGGCACGAATATTAATGAGTCCAAGCTAGTAT 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I t I I I I I I I I I I I I I I I I I I 
TTAGTGTCAGAAGATNGTTCCGTGCTTATAATTACTCAGGTTCGATCATA 

ITVF7QGTNINESNVV 
KSQSS7KARILMSPT.Y 
NHSLL7RHEY. .VQRSI 

CTATATTGTTTTACATTTTATACCGTANTCGAGGTGTTCGCACGATTTTG 
I I I I I I' I I I I ' I I I I I ' I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
GATATAACAAAATGTAAAATATGGCATNAGCTCCACAAGCGTGCTAAAAC 

SILFYTFIP7SRCSHDL 
LYCFTLLYR7RGVRTIW 
YIVLHFYTV7EVFARF 
GCCCATCCCAAGTGCATAAGATCATTGATATGACCTCTACGTTGGAGCGT
I I ' I I I I I I I I I I I I I ' I I I I I I I I I I I I I I I I ' M I I I I ' I I I I I I I I I 
CGGGTAGGGTTCACGTATTCTAGTAACTATACTGGAGATGCAAGCTCGCA 

AHPKCIRSLI .PLRWSV 
PIPSA.DH.YDLYVGA 
GPSQVHKI IDMTSTLER 



GTTAACCCGAGATCTAGTTGAGGGGGCATAGGTCTCATTTNTCTACGTGG 
I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ' I ' I I I ' I I I 
CAATTGGGCTCTAGATCAACTCCCCCGTATCCAGAGT AAANGGATGCACC 

LTRDLVEGA VSF7YV 
C.PEI .LRGHRSH7STW 
VNPRSS.GGIGLI7LRG 

AGGTTAAAGATCACCTTTATTNCANCCCTTGTAGATTCTAAACTNGAGGT 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I ' I ' I I I I I ' I I I ' I ' I I ' I I ' I 
TCCAATTTCTAGTGGAAATAANGTNGGGAACATCTAAGATTTGANCTCCA 

EVKDHLY??PCRF.T?G 
RLKITFI??LVDSKLEV 
G.RSPL??PL.ILN?R 

NGATCTCTNTAGGAGATCGGTCTCCCTTGGAACTCTNTAGGGGTNCC 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I » 739 

NCTAGAGANATCCTCTAGCCAGAGGGAACCTTGAGANATCCCCANGG 
?SL.EIGLPWNS?GVP 
DL7RRSVSLGTL .G7 
7 I S7GDRSPLEL7RG? 



Bgl II
BamH I



GGATCCCAACTTTTAGGAATGGATCTTAAAATTTTAGTTATAAGTTCAAA 



f I I I I I I I I I I I I I I I I I I I I I I I T I I I I I I I I I I I I I I I i I I I I I 1 I I I 



CCTAGGGTTGAAAATCCTTACCT AGAATTTTAAAATCAAT ATTCAAGTTT 

GSQLLGMDLK I LV I SSK 
DPNF.EWILKF .L.VQ 
RIPTFRNGSKNFSYKFK 

GTTAGAAAAATCTTTACCAAGAGCTTTGAGTCCATTGATGACATCCGTGA 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I 
CAATCTTTTTAGAAATGGTTCTCGAAACTCAGGTAACTACTGTAGGCACT 

LEKSLPRALSPLMTSV 
S.KNLYQEL.VH..HP. 
VRKIFTKSFESIDDIRE 

AACGGTGTACATGTCTCCGATGGACTCACTTGGTTTCATTCGGAAAAGTT 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
TTGCCACATGT ACAGAGGCTACCTGAGTGAACCAAAGTAAGCCTTTTCAA 

KRCTCLRWTHLVSFGKV 
NGVHVSDGLTWFHSEKF 
TVYMSPMDSLGF I RKS 

CGAAAGAGTGCATAAGAATATTGATTTTGGATTCTTTCACTCGGTTGGTG 
I I I I I I I 1 I I ' ' I I I I ' I I I ' I I I I I I I I I I I ' I I I I I ' I I I ' I I I ' I 
GCTTTCTCACGTATTCTTAT AACTAAAACCTAAGAAAGTGAGCCAACCAC 

RKSA EY FWI LSLGWC 

ERVHKN I DFGFFHSVG 
SKECIRILILDSFTRLV 

CCTTCATGAGTGACCTCAAGAGTCCTCCAAATATCAAAAGCCGAATCACA 

-f-H — I ( I I I I I I — I I I I — I I I I I I I I I I I I I — I I I I I I I I I I I I I — I I I I I I I I I I 

GGAAGTACTCACTGGAGTTCTCAGGAGGTTTATAGTTTTCGGCTTAGTGT 

LHE PQESSKYQKPNH 
AFMSDLKSPPNIKSRIT 
PS.VTSRVLQISKAESQ 



KLKCD. IHFCLMHKTGH 
N.NVIEFIFV.CTKQGI 
lEM.LNSFLSNAQNRA 



EcoRI



I 
TCATAGCCTTTGTGTTTAAAGCAAAAACATTCTTCTCCGATTCATCCCAT

I I I I I I I I I I I I I I I I I I I I I I — !■ I I I I 1 I I I I I I I I I I I I I I I I I I I I I I 

AGTATCGGAAACACAAATTTCGTTTTTGTAAGAAGAGGCTAAGTAGGGTA 

S.PLCLKQKHSSPIHPI 
HSLCV.SKNILLRFIP 
I AFVFKAKTFFSDSSH 

TCGCTCATCGGAAGAGAAAATTTTTGAAATCCATTTTCGACAATAGACCA 
I I I I I I I I I I I I I I I I I I I I I I — I I I I I I I I I I I I I I I I I I I I I I i I I I I I 
AGCGAGTAGCCTTCTCTTTTAAAAACTTTAGGTAAAAGCTGTTATCTGGT 

RSSEEKIFEIHFRQ.T 
FAHRKRKFLKS I FDNRP 
SLIGRENF.NPFSTIDQ 

Nco I

AAGCTCGAAATC'CATGGAAATGAGGAAGATCCTCATATGAGTTTTCCAAT 
I I I I I I I I I I I — I I I I t I t I I I — I — I— I — I I I I — I I I I I I I — I I I I I I I I I I I I I I I 
TTCGAGCTTTAGGTACCTTT ACTCCTTGT AGGAGT ATACTCAAAAGGTTA 

KARNPWK GRSSYEFSN 
KLE I HGNEEDPHMSFP I 
SSKSMXEMRK I L I VFQ 

ACATGTAATTCGACTCATTAAACATAGGTGGATGTGTAATGAAATGACCC 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I 

TGTACATTAAGCTGATGAATTTGTATGCACCTACAGATTACTTTACTGGG 

TCNSTH.T.VDV..NDP 
HV I RL I KHRWMCNEMT 
YM.FDSLNIGGCVMK.P 

TCATGCSCTATCTCTCTTGGGTATTAAACCAAATATGAGAGTGAGCCTTG 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ' I I I 
AGTACGSGATAGAGAGAACCCAT AATTTGGTTTATACTCTCACTCGGAAC 

HALSLLG I KPNMRVSL 
LM7YLSWVLNQI .E .AL 
SC? ISLGY.TKYESEP C 

CTCTGATACCAATTGTTAGGATCAGAGTGGCACTAAGAGAGGGGGGGAGA 
I I I I I I I I I I I I I I I I I I I I t I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
GAGACTATGGTTAACAATCCTAGTCTCACCGTGATTCTCTCCCCCCCTCT 

ALIPIVRIRVALREGGS 
L YQLLGSEWH ERGGV 

SDTNC DQSGTKRGGE 
EcoR I

gaat tagtgcagtggattaaaacttataagtttaaaaatg'aattcgtaaa 
cttaatcacgtcacctaattttgaatattcaaatttttacttaagcattt 
elvqwi ktykfknefvn 
n.csglklislkmns. 
• isavd.nl .v.k. irk 

TACGAG AAGATTTCGTTTTAATAGTAACTTGAGTAGATGAAAACCAAAAG 

I i. I ' I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

ATGCTCTTCTAAAGCAAAATTATCATTGAACTCAtCTACtTTTGGttTTC 

TRRFRFNSNLSR. KPK 
IREDFVLIVT.VDENQK 
YEKISF. .LE.MKTSS 

TTAACAGTAGTGTAAATAACAATTTCGGGAAAGTAAGAACTCACACATTC 

I I I I I I I ' I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

AATTGTAATCACATTTATTGTTAAAGCCCTTTCATTCTTGAGTGTGTAAG 

VNSSVNNNFGKVRTHTF 
LTVV. ITISGK.ELTHS 
Q.CK.QFRESKNSHI 

AAGGAACATACCAATTTAAAGTGGTTCGGTCAAAATGACCTACATCCACT 

I I I I I I I I ' I I I I I I I I I I I I I I I I I I I I I r I I I I I I I I I I I I I I I I I I I 
TTCCTTGTATGGTTAAATTTCACCAAGCCAGTTTTACTGGATGTAGGTGA 

KEHTNLKWFGQNDLHPL 
RNIPI .SGSVKMTYIH 
QGTYQFKVVRSK.PTST 
TGTGAAGCCTTCTTCGAAGAGGCTCCCAACTTCCACTAGCAAATCACTTT
|JL1JL-I ' i ' l l'''' l' '' '| iiii|ii ii| iiii| ii 'i |i iii|iiii| 
ACACTTCGGAAGAAGCTTCTCCGAGGGTTGAAGGTGATCGTTTAGTGAAA 

VKPSSKRLPTSTSKSL 
L SLLRRGSQLPLANHF 
CEAFFEEAPNFH.QITL 

GAAGGGGAAGGACAAATACCTCTCTTACNACCTTTTACAATGGTTCATAC

I ' I M I I I ' I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I 
CTTCCCCTTCCTGTTTATGGAGAGAATGNTGGAAAATGttACCAAGTATG 

RGRTNTSLTTFYNGSY 
EGEGQIPLL7PFTMVHT 
KGKDKYLSY7LLQWF I 

TCTT ACAAATTTTCAACGAGAAAGAAGGAGGTGAACATGCAAGCAATTGA 
1 JL i i XA^-!- J. 1 ' ' ' ' I ' ' ' ' I ' ' ' ' I ' I I I I ' I I I I I I I I I I ' I I I I ' I I i 
AGAATGTTTAAAAGTTGCTCTTTCTTCCTCCACTtGTACGTtCGTTAACt 

SYKFSTRKKEVNMQAIE 
LTNFQRERRR.TCKQL 
LLQIFNEKEGGEHASN. 

AAACAAGACTTGCTAAAGACTTTGCTAAGGCTTTTTTTCTCAATCTATTG 
TTTGTTCTGAACGATTTCTGAAACGATTCCGAAAAAAAGAGTTAGATAAC

NKTC.RLC.GFFSQSI 
KTRLAKDFAKAFFLNLL 
KQDLLRTLLRLFFSIYC 

CTTCTCAAAAGTTGTATTCTCTGCTGAGAATTGAGGGGTATTTATAGACC 

GAAGAGTTTTCAACATAAGAGACGACTCTTAACTCCCCATAAATATCTGG 
ASQKLYSLLRIEGYL.T 
LLKSCILC.ELRGIYRP 
FSKVVFSAEN.GVFID 

CCAAGAGGATTTAAATTTGGGCTCCAAATTTCGAATGCTCTTGGGTTCCC 

I I ' I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

GGTTCTCCTAAATTTAAACCCGAGGTTTAAAGCTTACGAGAACCCAAGGG 
PRGFKFGLQISNALGFP 
QEDLNLGSKFRMLLGS 
PKRI IWAPNFECSWVP 
GAGGTTGCCGGTGCCACCGCCTGTCAGTGTTTGACACTGGACAGTGTACT

I I I I I I I I I I — I I I I I I I I I I I — I — I — I I I I — I I I I — I I I I I I I — I— I — I I I I I I I I I I 

CTCCAACGGCCACGGTGGCGGACAGTCACAAACTGTGACCTGTCACATGA 

RLPVPPPVSV.HWTVY 
RGCRCHRLSVFDTGQCT 
EVAGATACQCLTLDSVL 

AGCGGTGCCGCCGCCGGACCTCTCGGGTGTTGGGCGGTGCCACCGCCTAG 

I I I I I I I I I I — I I I r I I I I I I — I — I — I — I I I I I I I I — i-H— 1 — I I I I I I — I I I I I I I I I I 

TCCCCACGGCGGCGGCCTGGAGAGCCCACAACCCGCCACGGTGGCGGATC 

RCHRRTSRVLGGATA 
SGATAGP LGCWAVPPPR 
AVPPPDLSGVGRCHRL 

ACTTTTTCAGCTCACTGGTTGGATTCCAAACTTGACCCAAACCAGTCCGA 

I I I I I I I I I I I I I I I I I I I I I I I — I I I I I I I I I I I I I I I I I J I I I I I I I I I 

TGAAAAAGTCGAGTGACCAACCT AAGGTTTGAACTGGGTTTGGTCAGGCT 

TFSAHWLDSKLDPNQSE 
LFQLTGWIPNLTQTSP 
DFFSSLVGFQT PKPVR 

ACTCGGGTCCAATTGACCCGTAACCGGATTATAGGATTAACCCTTAATCC 
I I I I I I I I I I — I I I I I I I I I I I I — I — I I I I I I I I I I I I I I I I I — I I I I I I I I I [ 
TGAGCCCAGGTTAACTGGGCATTGGCCTAATATCCTAATTGGGAATTAGG 

LGSN.PVTGL.D.PLI 
NSGSIDP.PDYRINP.S 
TRVQLTRNRI IGLTLNP 

TAACCCTAATTATATGCAAACTACGCAACTGAAAATATAGTCGTAAGCAA 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
ATT GGGATTAA TATA CGTTTGATGCGTTGACTTTTATATCAGGATTCGTT 

LTLI ICKLRN .KYSPKQ 

P.LYANYATENIVLSK 

NPNYMQTTQLKI .S.A 

GTTTTTAACCGGCAAACGTCGAGTCTTCTTCCGGCGATCTTTCGGCAGAC 
I I I I I I I I I I I I I I I I I I I I I I I — I I I I I I I I I I I I I I I I I 1 I I I I I I I I I 
CAAAAATTGGCCGTTTGCAGCTCAGAAGAAGGCCGCTGGAAAGCCGTCTG 

VFNRQTSSLLPAIFRQT 

FLTGKRRVFFRRSFGR 

SF PANVESSSGDLSAD 
TTCTGATATACCTTTGGATTTCTTCTAGCGGACTCCTAGTAGGGTCCCGA

I I ' I I I I I M t I' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

AAGACTATAAGGAAACCTAAAGAAGATCGCCTGAGGATCATCCCAGGGCT 

SDIPLDFF.RTPSRVP 
LLIYLWISSSGLLVGSR 
F YTFGFLLADS, .GPD 

TCTTGTGGCGAGTTTAGCGAGTAGCCGAACCTTCTCGGTGATCTCCGCAA 

I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

AGAACACCGCTCAAATCGCTCATCGGCTTGGAAGAGCCACTAGAGACGTT 

ILWRV.RVAEPSR.SPQ 
SCGEFSE PNLLGDLRK 
LVASLASSRTFSVISA 

ACCGCCGATGATCTCTTCGGCAGACTTTCGAAAACTTCGACAAGTGCCCG 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I [ I I I I I I I I I I I I I I I I 

TGGCGGCTACTAGAGAAGCCGTCTGAAAGCTTTTGAAGCTGTTCAGGGGC 
TADDLFGRLSKTSTSPR 
PPM I SSADFRKLRQVP 
NRR.SLRQTFENFDKSP 

ATTTCTTCTCGGTTGGTTCCGACAGCATCTCTAACGAAACTTCGGACACC 

I I I I I I I I ' I I' I I I 1 I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I 

TAAAGAAGAGCCAACCAAGGCTGTCGTAGAGATTGCTTTGAAGCCTGTGG 

FLLGWFRQHL.RNFGL 
DFFSVGSDS I SNETSDS 
I SSRXVPTASLTKLRTP 

TTGAATGTCCATCGAACTTGACTCCGGTAGGCTTGCTTTATATTTTCAGG 

I I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

AACTTACAGGTAGCTTGAACTGAGGCCATCCGAACGAAATATAAAAGTCC 
LECPSNLTPVGLLY I FR 
LNVHRT.LR.ACFIFSG 
MS I ELDSGRLALYFQ 

CTATCATAGTTAATCCTACATACTTAACTCAATAATATGGATTAGATTAA 

M ' I I ' I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

GATAGTATCAATTAGGATGTATGAATTGAGTTATTATACCTAATCTAATT 
LS .LILHT.LNNMD.IN 
YHS.SYILNSIIWIRL 
AIIVNPTYLTQ.YGLD. 
TTAACCCATCAATTGATTTCATCATCAAAATTCGACATTCAACAAACATC
I ' I I I ' I I ' I I I I I I I I I ' I I I I I I I I I I I I r I I I I I I I I I I I I I I I I I I 
AATTGGGTAGTTAACTAAAGTAGTAGTTTTAAGCTGTAAGTTGTTTGTAG 

PIN.FHHQNSTFNKH 
INPSIDFI IKIRHSTNI 
LTHQL ISSSKFDIQQTS 



CGTACTCAATAACCCATCAGGCTATAGTTACGTGACTATCTACTGTGATC

I I I I I I I ' I I 'I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 



GCATGAGTTATTGGGTAGTCCGATATCAATGCACTGATAGATGACACTAG 
PYSITHQAIVT.LSTVI 
RTQ.PIRL.LRDYLL.S 
VLNNPSGYSYVT I YCD 

CGTACGTGAAGTTAGCGAGTCATGATCCAGGTCGTGTCACTTATTGGCCG 
I I I I I I I I I I I I ' I I I I I I I I' I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
GCATGCACTTCAATCGCTCAGTACTAGGTCCAGCACAGTGAATAACCGGC 

RT.S.RVMIQVVSLIGR 
VREVSES SRSCHLLA 
PYVKLASHDPGRVTYWP 

AACACGTATCCCTTATCCAAATCCAGTCTTCTCAACTCTTCTAGCCTACC 

I t I I I I ' I I I ' I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I 

TTGTACATAGGGAATAGGTTTAGGTCAGAAGAGTTGAGAAGATCGGATGG 
TRIPYPNPVFSTLLAY 

ehvsliqiqssqlf.pt 
ntyplskssllnssslp 

EcoRI

cgtctctttttttattacttttgaaag'aattcaaatcaaaacagatacaa 

I I I I I I I ' I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I 

gcagagaaaaaaataatgaaaactttcttaagtttagttttgtctatgtt 
pslfllllkefkskqiq 
rlffyyf knsnqnryk 
vsffitferiqiktdt 

aataacacggtgagacactgtgacatgctagtctctggaaagcattaatt 

I I I I I I I I I I I I I I I I I I I I r I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I 

ttattgtgccactctgtgacactgtacgatcagagacctttcgtaattaa 
nntvrhcdmlvsgkh f 

ITR.DTVTC.SLESIN 
K.HGETL.HASLWKALI 
CGCGCATCCACAGACGTCGTCAGCTTCATCACCCACTTTTTCCTACATAA

I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

GCGCGTAGGTGTCTGCAGCAGTCGAAGTAGTGGGTGAAAAAGGATGTATT 

AHPQTSSASSPTFSY I 
SRIHRRRQLHHPLFPT. 
RASTDVVSF I THFFLHN 

Hind III

CCATGTCGCATGGCTTTGTTGATGACAGACCACCACA'AGCTTGCCTTTGG



1, ' I ' I 'I I I I I I I I I I I I I I I iT-r I I I I I I I I I I I I I I I I I I I I I I I I I 



GGTACAGCGTACCGAAACAACTACTGTCTGGTGGTGTTCGAACGGAAACC 
TMSHGFVDDRPPQACLW 
PCRMALLMTDHHKLAFG 
HVAWLC. .QTTTSLPL 



I I I M I I I I I I I I I I I t I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I , 

aacacggattgtctctctctctctgtctggctatcggaggagtaagtgaI 
lcltererqtdsllihy 
ca.qrerdrpiassft 
vvpnreretdr. pphsl 



tggcgatccgatcgccagcttcgctgctgttatttgcgttcctgatgctt

"TT ' ' I ' I I ' I ' I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I i'tt-t i I I 



accgctaggctagcggtcgaagcgacgacaataaacgcaaggactacgaa 

gdpiasfaavicvpda 
mairspaslllfaflml 
wrsdrqlrccylrs cl 

PstI

gcgctcacgggaagactgca'ggcccggcgcaggtcatgcattggcgtcta 
r I I I I ' I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I t I I 
cgcgagtgcccttctgacgtccgggccgcgtcgagtacgtaaccgcagat 

cahgktagpaqlmhwrl 

altgrlqarrsscigvy 

rsredcrpgaahalas 

Hind III

ctggggacaaaacaccgacgaggga'agcttagcagatgcttgtgccacag 

M I' I I ' I ' I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

gacccctgttttgtggctgctcccttcgaatcgtctacgaacacggtgtc 
lgtkhrrgklsrclchr 
wgqntdegsladacat 
tgdktptrea qmlvpq 
GCAACTACGAATACGTGAACATCGCCACCCTTTTCAAGTTTGGCATGGGC

I I I ' I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I 




QLR I REHRHPFQVWHG 
GNYEYVN I ATLFKFGMG 
ATTNT . TSPPFSSLAWA 



PNSRDQPRRPL PSEQR 
QTPEINLAGHCDPRNNG 
DLQRSTSPATVTLGTT 



LRA LEQRNPVLPGAWRQ 
CARLSSE I QSCQERGV 
AARA AAKSSPARSVAS 

AGGTGATGCTCTCCATCGGAGGTGGCGGGTCTTATGGCCTGAGTTCCACC 
I I I ' I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ' I I I I I' I I 
TCCACTACGAGAGGTAGCCTCCACCGCCCAGAATACCGGACTCAAGGTGG 

GDALHRRWRVLWPEFH 

KVMLS I GGGGSYGLSST 

R .CSPSEVAGLMA.VPP 
GAAGACGCCAAGGACGTAGCGTCATACCTCTGGCACAGTTTCTTGGGTGG

I I I ' I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

CTTCTGCGGTTCCTGCATCGCAGTATGGAGACCGTGTCAAAGAACCCACC 
RRRQGRSV I PLAQFLGW 
EDAKDVASYLWHSFLGG 
KTPRT RHTSGTVSWV 

XhoI

ttctgctgctcgctac'tcgagacccctcggggatgcggttctggatggca 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

aagacgacgagcgatgagctgtggggagcccctacgccaagacctaccgt 
fccslletprgcgsgwh 
saarysrplgdavldg 
vlllatrdpsgmrfwma 

tagacttcaagatcgccggagggagcacagaacactatgatgaacttgcc 

I I I I I ' I I ■ I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
ATCTGAAGTTGTAGCGGCCTCCCTCGTGTCTTGTGATACTACTTGAACGG 

RLQHRREHRTL TCR 
IDFNIAGSTEHYDELAA 
TSTSPEAQNTMMNLPL 

GCTTTCCTCAAGGCCTACAACGAGCAGGAGGCCGGAACGAAGAAAGTTCA 
I I I I I ' I I ' I I ' I I I ' I I ' I ' I I ' I I I I I M i I I I I I I I I I I I I I I I I I I 
CGAAAGGAGTTCCGGATGTTGCTCGTCCTCCGGCCTTGCTTCTTTCAAGT 

FPQGLQRAGGRNEEESS 
FLKAYNEQEAGTKKKVH 
SSRPTTSRRPERRRKF 

CTTGAGTGCTCGTCCGCAGTGTCCTTTCCCGGATTACTGGCTTGGCAACG 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

GAACTCACGAGCAGGCGTCACAGGAAAGGGCCTAATGACCGAACCGTTGC 
LECSSAVSFPGLLAWQR 
LSARPQCPFPDYWLGN 
T.VLVRSVLSRITGLAT 

BglII

CACTCAGAACAGATCTCTTCGACTTCGTGTGGGTGCAGTTCTTCAACAAC 

I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I 

GTGAGTCTTGTCTAGAGAAGCTGAAGCACACCCACGTCAAGAAGTTGTTG 

TQNRSLRLRVGAVLQQ 
ALRTDLFDFVWVQFFNN 
HSEQISSTSCGCSSSTT 
CCTTCGTGCCATTTCTCCCAGAACGCTATCAATCTTGCAAATGCGTTCAA

I I I I I I I I I I I I I 1 I I I I I I I I I 1 I I t I I I I I I I I I I I I j I I I I I I I I I I 

GGAAGCACGGTAAAGAGGGTCTTGCGATAGTTAGAACGTTTACGCAAGTT 
PFVPFLPERYQSCKCVQ 
PSCHFSQNA I NLANAFN 
LRAISPRTLSILQMRS 

CAATTGGGTCATGTCCATCCCTGCGCAAAAGCTGTTCCTTGGGCTTCCTG 

i I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I 

GTTAACCCAGTACAGGTAGGGACGCGTTTTCGACAAGGAACCCGAAGGAC 

QLGHVHPCAKAVPWASC 
NWVMSIPAQKLFLGLP 
T IGSCPSLRKSCSLGFL 

CTGCTCCTGAGGCTGCTCCAACTGGTGGCTAGATTCCACCCCATGATCTC 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

GACGAGGACTCCGACGAGGTTGACCACCGATGTAAGGTGGGGTACTAGAG 

CS.GCSNWWLHSTP.S 
AAPEAAPTGGYIPPHDL 
LLLRLLQLVATFHPMIS 

ATATCTAAAGTTCTTCCGATCCTAAAGGATTCCGACAAGTACGCAGGAAT 

I I I I I I I I I I I I I I I I I I I I t I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

TATAGATTTCAAGAAGGCTAGGATTTCCTAAGGCTGTTCATGCGTCCTTA 

HI.SSSDPKGFRQVRRN 
ISKVLPILKDSDKYAGI 
YLKFFRS.RIPTSTQE 

CATGCTGTGGACTAGATACCACGACAGAAACTCCGGCTACAGTTCTCAAG 

I I I I I I I I I I I I I I I I I I I I 1 I I I I I t I I I I I I 1 I I I I I I I I I I I I I I I I 

GTACGACACCTGATCTATGGTGCTGTCTTTGAGGCCGATGTCAAGAGTTC 
HAVD. IPRQKLRLQFSS 
MLWTRYHDRNSGYSSQ 
SCCGLDTTTETPATVLK 

TCAAGTCCCACGTGTGTCCAGCGCGTCGGTTCTCCAACATCTTATCTATG 

I I I I I I I I I I I 1 I I I I I I I I I I I 1 I I I I I I I I I 1 I I I I I I I I I I I I I I I I 

AGTTCAGGGTGGACACAGGTGGCGCAGCCAAGAGGTTGTAGAATAGATAC 

QVPRVSSASVLQHL I Y 
VKSHVCPARRFSN I LSM 
SSPTCVQRVGSPTSYLC 
CCGGTGAAGTC TTCCAAGTAAACCTGAACGGCGTAGATGATCGGTGGTCG
JlAAJLlJ>-J--i.J. l l ' ' ' l ''' 'l' '' ' 'i'i|i i i i |iiii|iiii | i i ii| 
GGCCACTTCAGAAGGTTCATTTGGACTTGCCGCAtCTACtAGCCACCAGC 

AGEVFQVNL N GVDDRWS 

PVKSSK.T.TA.MIGGR 

R SLPSKPERRR.SVV 

AAAACT CCGATCATCATGGGTCCCCATCCGTATCCGTGCGTTGCTACGTT 

M I I I ' I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I 

TTTTGAGGCTAGTAGTACCCAGGGGTAGGCATAGGCACGCAACGAtGCAA 
KTP I IMGPHPYPCVATL 
KLRSSWVPIRIRALLR 
ENSDHHGSPSVSVRCYV 

ATGGTGTTTCCCTTGT ATGTTGGTCTTTTCAATAATATAATAAQGGGTTA 

I I ' I I I I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

TACCACAAAGGGAACATACAACCAGAAAAGTTAttATAttATTCCCCAAt 

WCFPCMLVFSII GV 
YGVSLVCWSFQ.YNKGL 
MVFPLYVGLFNNI IRG. 

GTT TTACGTTTCCATATTTTCCATGTTCGAAAACAGTATATTTGCTGCCC 

I I ' I I I I ' I I I I I I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

CAAAATGCAAAGGTATAAAAGGTACAAGCTTTTGTCATATAAACGACGGG 

SFTFPYFPCSKTVYLLP 
VLRFHIFHVRKQYICCP 
FYVSIFSMFENSIFAA 
CTTCCAAATTTGAAAAAGATAAAATAAATATATAACTAAAAATATCCTCT
I I I I I ' I I I I I ' I I I I I I I I I I I I I I I I ' I I I ' I I I I I I I I I ■ I I I I I t I 
GAAGGTTTAAACTTTTTCTATTTTATTTATATATTGATTTTTATAGGAGA 

LPNLKKIK. lYN.KYPL 
FQI.KR.NKYITKNIL 
PSKFEKDKINI.LKISS 

TTTTTTTTTCTTTCGACAAATATATAACTCTTAACTTTCCCAATTGTTTA 
I I I I I I I I I I I I I I I I I ' I I I I I I I ' I I I I I I ' I I I I' I I I I I I I I' I I I 
AAAAAAAAAGAAAGCTGTTTATATATTGAGAATTGAAGGGGTTAACAAAT 

FFFFRQIYNS .LSQLF 
FFFSFDKY I TLNFPNCL 
FFFLSTNI .LLTFPIV. 

AGCAAAAGATATAAATCCTCTTCCACACAAAAGACGAATCCATGATTGCT 
I I I I I I I I ' I I I I I I I ' I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
TCGTTTTCTATATTTAGGAGAAGGTGTGTTTTCTGCTTAGGTACTAACGA 

KQKI ILFHTKDESMIA 
SKRYKSSSTQKTNP.LL 
AKDINPLPHKRRIHDC 

GGATTGCTGTCTACTGGTGCCGAAATGGCGACGAGAGAAGCTTGTGCTAC 
I I I I M ' I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
CCTAACGACAGATGACCACGGCTTTACCGCTGCTCTCTTCGAACACGATG 

GLLSTGAEMATREACAT 
DCCLLVPKWRREKLVL 
Wl AVYWCRNGDERSLCY 

CTGCAATTACAAGTTCGTCAACATTGTCTTCCTTGCCATGTTTGGTGACG 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I t I I I I I I I 

GACGTTAATGTTCAAGCAGTTGT AACAGAAGGAACGGTACAAACCACTGC 

CNYKFVN I VFLAMFGD 
PAITSSSTLSSLPCLVT 
LQLQVRQHCLPCHVW. R 

CCATACTCCCGTGATCAGGACACACCTCTGGAACAGTTTCTTGGGAAGTT 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

GGTATGAGGGCACTAGTCCTGTGTGGAGACCTTGTCAAAGAACCCTTCAA 
AILP.SGHTSGTVSWEV 
PYSRDQDTPLEQFLGKL 
HTPVIRTHLWNSFLGS 
AATCTTCTTCTCGGCTCCTCGGCGACCAATCTTGTGAGGTTCTTCTCCTG

H — I I I I I I I I I — I I I I I I I I I I I I I I I I I — h-t — I — I — I I I I I I I I I I I I I j I I I I I 

TTAGAAGAAGAGCCGAGGAGCCGCTGGTTAGAACACTCCAAGAAGAGGAC 
NLLLGSSATNLVRFFS 
IFFSAPRRPIL.GSSP 
SSSRLLGDQSCEVLLL 

AATGGTGTCCACTTCGACATCGAAGGTCTACCTGAGCGCANATCCACAGT 
I I I I I ' ' I ' I I ' I ' I ' ' ' ' I ' I 1 ' ' I I I I — I I I 1 ' I ' I I ' ' ' ' I I ' t I I 
TT ACCACAGATGAAGCTGTAGCTTCCAGATGGACTCGCGTNTAGGTGTCA 

MVSTSTSKVYLSA7PQ 
EWCPLRHRRST.A7IHS 
NGVHFDIEGLPER7STV 

TCCGACTACGTGTGGGTGCAGTTCTACTACACAGGCAACTCGCAGATGCC 

— I I I I I I I I — I I I I I I I I I I I I I I I I I I — h— |— I — I I I I I I I — I I I I I I I I I I — H 

AGGCTGATGCACACCCACGTCAAGATGATGTGTCCGTTGAGCGT.CTACGG 
FRLRVGAVLLHRQLADA 
S DYVWVQFYY TGNSQMP 
PTTCGCSSTTQATRRC 

CGGTAACAATGGGTTCTCCATCCTGCATGGAAGGTGTTCCCTGGACTTCC 
I I I I I I I I I I I ' I ' I I 'I I I I I I I I ' t I I I I I ' I I I I I I I I I ' I I I I I I I 
GCCATTGTTACCCAAGAGGTAGGACGTACCTTCCACAAGGGACCTGAAGG 

R QWVLHPAWKVFPGLP 

GNNGFS I LHGRCSLDF 

PVTMGSPSCMEGVPWTS 

SacI SpeI

TGCTGCTCCTCAGGCTGCTGGAAGGAGCTCCATTCCACTAGTGATCTTAC 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I — I I I I I I I I I I I I I I I I I I I 
ACGACGAGGAGTCCGACGACCTTCCTCGAGGTAAGGTGATCACTAGAATG 

AAPQAAGRSSIPLVIL 
LLLLRLLEGAPFH. .SY 
CCSSGCWKELHSTSDLT 

ACGTGTCTTATCATCAAGAATTATAGCAAGTACCGAGGGATTATTAAAAT 
I I I I I 1 I I I I I I I I I I' 'I I I I 'I I ' I I I I I ' ' I I I I I I I ' I' I I I I I I I 
TGCACAGAATAGTAGTTCTTAATATCGTTCATGGCTCCCTAATAATTTTA 

HVSYHQEL QVPRDY, N 

TCLIIKNYSKYRGIIKi 
RVLSSRI lASTEGLLK 
AAAAAAAAAGGGAAGAATGGGAATTAGAATTAAAACTGAAACCGGCCATG



I I I I I I I I I I I I I I I I I I I I I I I I i I — I I I I I 1 I I I r I I I — I I I I I I I I I I I 



TTTTTTTTTCCCTTCTTACCCTTAATCTTAATTTTGACTTTGGCCGGTAC 

KKKGKNGN.N.N.NRP. 
KKKGRMG I R I KTETGH 
K KKREEWELELKLKPAM 

AAGAACGTTTTCGAGTGAAGACAACGACAGTATGAGACGGTAGTTTGCTA 
I I I I I I I I I I I I I I I I I I I I ' I I ' I I I I ' I I I ' I I I I I I I I I I I I I I I I I 
TTCTTGCAAAAGCTCACTTCTGTTGCTGTCATACTCTGCCATCAAACGAT 

RTFRVKTNDSMRR FA 
EERFE.RQTTV.DGSLL 
KNVSSEDKRQYETVVCY 

TGGACATGGATCGTTCCCAAAGCAGTCCAAGTCTTTATGAACCGGTCTAT 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
ACCTGTACCTAGCAAGGGTTTCGTCAGGTTCAGAAATACTTGGCCAGATA 

MDMDRSQSSPSLYEPVY 
WTWI VPKAVQVFMNRS I 
GHGSFPKQSKSL TGL 

CGGTTCAGCCTTCAAGAACCGCGAGGATAACCGGCCCAAGAGAAACAACA 
I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
GCCAAGTCGGAAGTTCTTGGCGCTCCTATTGGCCGGGTTCTCTTTGTTGT 

RFSLQEPRG PAQEKQQ 
GSAFKNREDNRPKRNN 
SVQPSRTAR I TGPRETT 
AATTGTGGTGAGCTTTTANTATAAACCGAACGGTGCCGTCCGTCAGATGT
I I I ' I I I I I I I I I I I I I I ' I 'I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
TTAACACCACTCGAAAATNATATTTGGCTTGCCACGGCAGGCAGTCTACA 

I VVSF7YKPNGAVRQM 
KLW. AF? ! NRTVPSVRC 
NCGELL? TERCRPSDV 

Bgl II

taaatggacggcggata'gatctccagagtaaatctgaggaaaatcgttcc 

r ' I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

atttacctgccgcctatctagaggtctcatttagactccttttagcaagg 

lngrr idlqs kseenrs 
mdgg.isrvnlrkivp 
kwtadrspe. i.gksf 

ggcccccctaccacgacccacgcgatccgtcctctcccccaccccctaca 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

ccggggggatggtgctgggtgcgctaggcaggagagggggtgggggatgt 

GPPTTTHA I RPLPHPLH 
APLPRPTRSVLSPTPY 

rppyhdprdpssppppt 



EcoR I

CCTTTTTCTTCTTCCGCTCCTGCGATCGGTTATTTGATTTTGTGTATGAT 
i i i i i 'i i ' i i i i i i ' i i i i i i i i i i i i i i i i i i i r i i i i i i i i i i i i i i 
GGAAAAAGAAGAAGGCGAGGACGCTAGCCAATAAACTAAAACACATACTA 

lfllpllrsvi .fcv. 
tffffrscdrlfdfvyd 

PFSSSAPA IGYL I LCMI 

ATCCAATTTCTTTTCTGGAGTGGTATCCTATTCTAATTTCTTAGATTGTT 
i i i ' i i i i i i i i i i i i i i i i i i i i i i i i i i i i i 1 i i ' i i i i i i i i i m 
TAGGTTAAAGAAAAGACCTCACCATAGGATAAGATTAAAGAATCTAACAA 

ypisflewypilis. iv 
iqflfwsgilf.flrll 
snffsgvvsysnfldc 

GTATTGAACCATCAGTTTTGGTTTAAGCGCATGATGGCGGAGAGTTTCGG 
i i i ' i 'i i i i ' i ' i i i i i i i i i i i i i i i i i i i i i i i ' i i i 'i i i j l i i i i 
CCTAACTTGGTAGTCAAAACCAAATTCGCGTACTACCGCCTCTCAAAGCC 

VLNHQFWFKRMMAESFG 

Y.TISFGLSA.WRRVS 

CIEPSVLV.AHDGGEFR 
GAGATGGGAGTCAGATCCCTTGTTTTCTGCTGCCGAAGTGGTGCAAGATT

I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

CTCTACCCTCAGTCT AGGGAAC A AAAGACGACGGCTTCACCACGTTCAAA 

RWESDPLFSAAEVVQD 
GDGSQIPCFLLPKWCKI 
EMGVRSLVFCCRSGARF 

CGGCCGATAGGTTTTTTCTCTCATTTTAAGCTCAATTATGCGGTCATTCT 
I I I I I I I I I I I I I I I I I I I I I 1 I t I I I I I I ' I ' I I I I I I I I I I I I I I 'I I 
GCCGGCTATCCAAAAAAGAGAGTAAAATTCGAGTTAATACGCCAGTAAGA 

SADRFFLSF .AQLCGHS 
RPIGFFSHFKLNYAVIL 
GR.VFSLIL SSIMRSF 

TGTTAGGCTTTGGAGAATTTGCTCTATTTCGAAAGAAATTGCTGCTTTCT 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

ACAATCCGAAACCTCTTAAACGAGAT AAAGCTTTCTTTAACGACGAAAGA 

C.ALENLLYFERNCCFL 
VRLWRICSISKEIAAF 
LLGFGEFAL FRKKLLLS 

AGTTTTGATTAGTCCCTATAAAATTTGCTTTCGGTTCTGAATATCCGAGA 

I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

TCAAAACTAATCAGGGATATTTTAAACGAAAGCCAAGACTTATAGGCTCT 

VLISPYKICFRF. ISE 
F.LVPIKFAFGSEYPR 
SFD.SL.NLLSVLNIRE 

EcoR I

ATGTCGTATCGTCAATGACGATTCTTTTTTAGAATTCTAATACTTTGTCC 

I I I I I I I I I I t I I I I I I I I I I I I I — I— I— I — I I I I I I I I I I I I I I I 1 I I I I I I I 

TACAGCATAGCAGTTACTGCT AAGAAAAAATCTTAAGATTATGAAACAGG 
NVVSSMTILF.NSNTLS 
MSYRQ.RFFFRILILCP 
CRIVNDDSFLEF.YFV 

TGTTTTCTGTGATTTAATGGAGAAAATATTGTTCCTTTTAGTGATCTATG 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

ACAAAAGACACTAAATTACCTCTTTTATAACAAGGAAAATCACTAGATAC 

CFL.FNGENIVPFSDLC 
VFCDLMEK I LFLLVIY 
LFSVI .WRKYCSF. .SM 
CTCTCC CGACCATTAGGATGAGGGTTGAAGGTGAAAATACTTTCTGGTAA

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

GAGAGGGCTCGTAATCCTACTCCCAACTTCCACTTTTATGAAAGACCAtt 

SPDH.DEG.R. KYFLV 
ALPT I RMRVEGENTFW. 
LSRPLG. GLKVK I LSGN 

TTTTCCTCT CTAAATTCTTCCAAACACGACACAAGTATAATTATAGACCA 

I I I ' I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

AAAAGGAGAGATTTAAGAAGGTTTGTGCTGTGTTCATATtAATAtCTGGT 
IFLSKFFQTRHKYNYRP 
FSSLNSSKHDTSI I IDQ 
FPL.ILPNTTQV.L.T 

AGATTGATTCTTCTTATGCACCGATTCTCACTTCCGTTCCCTCTGTGTTA 

I I I ' I ' I I I I I ' I I 1 I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I 1 I I I r I I 

TCTAACTAAGAAGAATACGTGGCTAAGAGTGAAGGGAAGGGAGACACAAT 
RL I LLMHRFSLPFPLCY 
D. FFLCTDSHFPSLCV 
KIDSSYAPILTSLPSVL 

TGGTTA TCGTTGTTACTGATGGTTGCTTAACTCATGGGGTAGCGCCTGGG 

I I ' I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I 

ACCAATAGCAACAATGACTACCAACGAATTGAGTACCCCATCGCGGACCC 

GYRCY .WLLNSWGSAW 
MVI VVTDGCLTHGVAPG 
WLSLLLMVA.LMG.RLG 



FIG. 17F-3 



PstI

Sal I

TGATCCGTTGACCTGCAGGTCGAC 
I I I I I ' I I ' I I ' I I I I I I I I I I ! ■ I — 4924 
ACTAGGCAACTGGACGTCCAGCTG 
V I R . P A G R 
S V D L Q V D 
DPLTCRST 
Hind III

Xho I Sal I START HERE

tcactggtacggggcccccctcgagg'tcgacggtatcgata'agctttgat 

I I I I — I I I I I I I I I I I I I I 1 I I I I I — I I I I I I I I I I I I I I I I I I I I I I I I I I 

agtgaccatgccccggggggagctccagctgccat agctattcgaaacta 
slvrgpprgrryr .ali 
hwygaplevdg i dkl 

LTGTGPPSRSTVSISFD 

TTGATCTCTCTTCTTCAATCTCTCTCTCTCTCTCTCTCTCTCTCTGTATG 

i i i i i i i i ( i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i i 

AACTAGAGAGAAGAAGTTAGAGAGAGAGAGAGAGAGAGAGAGAGACATAC 

SSLNLSLSLSLSLSLY 
SLLSISLSLSLSLSLCM 
LFSQSLSLSLSLSLSVC 

TCTTTAAATATGGTTGTAATGCTGAATTGCTATGTTTATCTTGGCCAAAC 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ] I I 
AGAAATTTATACCAACATTACGACTTAACGATACAAATAGAACCGGTTTG 

VFKYGCNAELLCLSWPN 
SLNMVVMLNCYVYLGQT 
L.IWL.C.IAMFILAK 

TGTGTCCATCTTTGAGCAGATAAATCTGGCGATAATGTTCTTTTTACTGA 
I I I I I I I I i I I I I I I I I I I I I I' I I ' I I I I ' I I I I ' I I ' I I I I ' I I I ' I I 
ACCCAGGTAGAAACTCGTCTATTTAGACCGCTATT ACAAGAAAAATGACT 

CVHL ADKSGDNVLFTE 
VSIFEQINLAIMFFLL 
LCPSLSR. IWR.CSFY. 

PstI

aagcactgca'ggatgagggcctgaaatcacatcggacgcccactgggtca 

I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

ttcgtgacgtcctactcccggactttagtgtagcctgcgggtgacccagt 

stag.gpeitsdahwv 
kalqdeglkshrtptgs 
khcrmra.nhigrplgh 

NcoI

tgatgatatggactcctccacagcgagcagc'catgggatgtgagatccac 

I I I I I I I I I I I I I I I I I I t I I I ' I I ' 'I I I 'I I I I ' I I ' I ' I 'I I I I ' I I 

actactatacctgaggaggtgtcgctcgtcggtaccct acactctaggtg 

MMIWTPPQRAAMGCE I H 
YGLLHSEQPWDVRST 
DDMDSSTASSHGM.DP 
ATAGCAGCGTAGATAAGGGAAGCCCGCAACACTAGGCTGTTGTTGTTCCA

I I I I I I I I I I I I I I I I I I I I I I I I I I I ' I I I ' I I I I I I I I I I I I I I I I I I 
TATCGTCGCATCTATTCCCTTCGGGCGTTGTGATCCGACAACAACAAGGT 

lAA. IREARNTRLLLFQ 

QRR GKPATLGCCCS 

HSSVDKGSPQH AVVVP 

GTAAAGATCGAAAGGTCAGGCGACAGTGACGATCGACTTTTTCGAGCATG 

I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

CATTTCTAGCTTTCCAGTCCGCTGTCACTGCTAGCTGAAAAAGCTCGTAC 

RSKGQATVT I DFFEH 
SKDRKVRRQ RSTFSSM 
VKIERSGDSDDRLFRA. 

ATGACAACGACGACCTGCTCCTGCAATATCCGTCCCCTACCGTAGAGTGG 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ) I I I I I I I I I I I I I 
TACTGTTGCTGCTGGACGAGGACGTTATAGGCAGGGGATGGCATCTCACC 

DDNDDLLLQYPSPTVEW 
MTTTTCSCNIRPLP.SG 
QRRPAPA I SVPYRRV 

GAATAAATGGGTTTGTAGTTGCACTATTTCTCGCAGGAATTAATTGAAAG 

I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I 1 I I I I I I I I 1 I I I I I I I I 

CTTATTTACCCAAACATCAACGTGATAAAGAGCGTCCTTAATTAACTTTC 

E.MGL.LHYFSQELIES 
NKWVCSCT I SRRN LK 
GINGFVVALFLAGIN.K 

CCCTGCAAATTGCTGTTTCTCTTTCCTTATATTAAACCTTCCTCCTGTTA 

I I I — I I I I — HH — I — I I I I I I I I I I I ' I I I ' I ' I — I I I I I I — I — I I I — I I I I I I I I I I I 

GGGACGTTTAACGACAAAGAGAAAGGAATATAATTTGGAAGGAGGACAAT 

PANCCFSFL I LNLPPV 
ALQIAVSLSLY.TFLLL 
PCKLLFLFPY I KPSSCY 

BamH I Bgl II

cattaaaattgcatgttaagacatttctgtatg'gatccgaacatga'gatc 

I I I I I I I I I I I I I I I I 1 I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I 

gtaattttaacgtacaattctgtaaagacatacctaggcttgtactctag 

tlklhvktflygsehei 
h ncmlrhfcmdpnmrs 
ikiac.disvwirt.d 
TATCATTGAAGTAATGGGTAGGATTTACATTATCATCATCATCATCATCT
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
ATAGTAACTTCATTACCCATCCTAAATGTAATAGTAGTAGTAGTAGTAGA 

YH.SNG.DLHYHHHHHL 
IIEVMGRIYIIIIIII 
LSLK.WVGFTLSSSSSS 

Nco I

c'cATGGGTTTGGATCTAATTAGACCGAAAACCTCATTTAAAATCCAACCC 

I I I I I I I I I I I I I I I I I I I I I I I I — |— t— I — I I I I I I I I I I I I I I I I I I I I I I I 

GGTACCCAAACCTAGATTAATCTGGCTTTTGGAGTAAATTTTAGGTTGGG 

HGFGSN. TENL I NPT 
SMGLDL I RPKTSFK IQP 
PWVWI LDRKPHLKSNP 

CAATATTGGCTTGACTTGCTCCATCTCCAAGAAAAATACAACAAGAACAA 
I I I I I I I I I I I I I I I I I I I I I I I i I I 1 I I I I I I I I I I I I I I I I I I I I I I I 
GTTATAACCGAACTGAACGAGGTAGAGGTTCTTTTTATGTTGTTCTTGTT 

PILA.LAPSPRKIQQEQ 
QYWLDLLHLQEKYNKNN 
N IGLTCS I SKKNTTRT 

CAAAAATTTAGGATGCACATTGAATTGATTTGGTCACTATGAGAGAATCA 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
GTTTTTAAATCCTACGTGTAACTTAACTAAACCAGTGATACTCTCTTAGT 

QKFRMHIELIWSL.ENH 
KNLGCTLN FGHYERI 
TKI .DAH. IDLVTMRES 
TGGATT AAAAATATTAAAATAAAAAATAAATCATAATCATCTACTCACTC

I I I M I I I ' I I ' I ' I ' I ' I ' I I I I I I I I I I I I t I I I I I I I I I I I I I I I I 

ACCTAATTTTTATAATTTTATTTTTTATTtAGTATTAGT AGATGAGTGAG 

GLKILK.KINHNHLLT 
MD.KY.NKK. 1 I I lYSL 
Wl KN I K I KNKS SSTH S 

TAACGATTC ACATTCTATGCACCAAATTTGACATCGGCTTCTAATTAATT 

J '' ll '' ' '|ii i' l'i'i | i i i i I I I I I I I I I I I I I I I I I I I I I I I I I I 

ATTGCTAAGTGTAAGATAGGTGGTTTAAACTGTAGCCGAAGATTAATt/\A 
LTIHILSTKFDIGF.LI 
RFTFYPPNLTSASN.F 
NDSHS IHQI .HRLLIN 

TCATATA TTAGGTTCTAAAAAATCTCTCCCTTTGACAGATGAATAAATAT 



SYIRF.KISPFDR.INI 
HILGSKKSLPLTDE.I 
FIY.VLKNLSL.QMNKY 

TTCTTTTAA TTCGTTAGGGAAGGATCTAATATAATATATATATATATATA 

1 i ^ i 1 i ' J. ' I ' ' ' ' I ' ' ' ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

AAGAAAATTAAGCAATCCCTTCCTAGATTATATTATATAtATATATATAt 

SFNSLGKDL I .YIYIY 
FLLIR.GRI .YNIYIYI 
FF.FVREGSNIIYIYIY 

TATTTATTTATTAGATTCTAACCATTTCTCTCACAAGAATATGAATCGAG 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I t I I I I I I I I I I t I I 

ATAAATAAATAATCTAAGATTGGTAAAGAGAGTGttctTAtACTTAGCtG 
IFIY, ILTISLTRI ID 
YLF I RF PFLSQEYEST 
lYLLDSNHFSHPNMNR 



SEQ A



GGCCATATCTGCAAAAACCCACCAATTGTTCACAGTAAACGCTCATTGAA

CCGGTATAGACGTTTTTGGGTGGTTAACAAGTGTCAT TTGCGAGtAACtt 

GHICKNPPIVHSKRSLN 
AISAKTHQLFTVNAH. 
RPYLQKPTNCSQ.TLIE 



FIG. 18B-1 



SUBSTITUTE SHEET (RULE 26)

WO 99/15668

PCT/US98/03343

79/91



TTAAGGTCGAAATTACTTTTAAATTTCTAGAGATTTCCAATAAAATATAC 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

AATTCCAGCTTTAATGAAAATTTAAAGATCTCTAAAGGTTATTTTATATG 

GRNYF. ISRDFQ.NI 
IKVEITFKFLEISNKIY 
LRSKLLLNF.RFPIKYT 

TCGTATCTTTTACAGTGATGATGCTCCGGATGATAAGATGGAAGGATGCG 
I I I I I I I I I I I I I I I I I I ' I I ' I ' I I I I I I I I I I I I I I I M ' I I I ' I I I I 
AGCATAGAAAATGTCACTACTACGAGGCCTACTATTCTACCTTCCTACGC 

LVSFTVMMLRMIRWKDA 
SYLLQ..CSG. .DGRMR 
R I FYSDDAPDDKMEGC 

TGTGTCAGCCGCCTGCGATGTCTGTGGCGGGGACGAGACGAAGACAAGGA 
I I I I I I I' I I I I I ' I I ' I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I 
ACACAGTCGGCGGACGCTAGAGACACCGCCCCTGCTCTGCTTCTGTTCCT 

CVSRLRSLWRGRDEDKD 
VSAACDLCGGDETKTR 
CCQPPA I SVAGTRRRQG 

CGTGAGCGGACGATACCAAGTCTTCTCCTCCCCCACCACGCACGTCTCAG 
I I I I I I I I I I ' I I I I I' I ' M I I I I I I I ' I ' I ' I I ' I I ' I I I ' I I I I I I I 
GAACTCGCCTGCTATGGTTCAGAAGAGGAGGGGGTGGTGCGTGCAGAGTC 

VSGRYQVFSSPTTHVS 
T ADDTKSSPPPPRTSQ 
RERT I PSLLLPHHARLR 

ATTCCCGATACGGCCTATCCCGGTGGCGTGTGGACTGCACAGACGAACGA 
I I I I I I I I I I I I I I I I I I I I I I I I I t I I I I I I I I I I I I I I I I I I I I I I I I 
TAAGGGCTATGCCGGATAGGGCCACCGCACACCTGACGTGTCTGCTTGCT 

DSRYGLSRWRVDCTDER 
I PDTAYPGGVWTAQTNE 
FPIRPIPVACGLHRRT 

GTAAATGCCCATCCCCCCTCTTTCATTCTTTCTCTTTGCGTGTGTGAGAG 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

CATTTACGGGTAGGGGGGAGAAAGTAAGAAAGAGAAACGCACACACTCTC 

VNAHPPSFILSLCVCER 
MPIPPLSFFLFACVR 
SKCPSPLFHSFSLRV.E 
GAGCGCCTATAAATAAGCACGAAACAAGCCCCTTTTCTCTCCAAGAACAC



CTCGCGGATATTTATTCGTGCTTTGTTCGGGGAAAAGAGAGGTTCTTGTG 

SAYK ARNKPLFSPRT 
GAP I NKHETSPFSLQEH 
ERL I STKQAPFLSKNT 

ACCACACCATTGACACACTACATCCTCTGCTTCTTCGAGCCTTTTCGCCT 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I — I— H — I I I I I I I I I 
TGGTGTGGT AAGTGTGTGATGTAGGAGACGAAGAAGCTCGGAAAAGCGGA 

HHT I HTLHPLLLRAFSP 
TTPFTHY I LCFFEPFRL 
PHHSHTTSSASSSLFA 



TCCTTCCTCGTCTAACCATGTCGACCTGCGGCAACTGCGACTGCGTTGAC 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I 

AGGAAGGAGCAGATTGGTACAGCTGGACGCCGTTGACGCTGACGCAACTG 
SFLV.PCRPAATATALT 
PSSSNHVDLRQLRLR. 
FLPRLTMSTCGNCDCVD 

AAGAGCCAGTGCGTGTAAGTCATCCTCCATCCCTCCACCTCTTCTTCTTC 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I — I I I I I I I I I — iH— I — I I 1 I I I I I I 

TTCTCGGTCACGCACATTCAGT AGGAGGTAGGGAGGTGGAGAAGAAGAAG 

RASACKSSS I PPPLLL 
QEPVRV SHPPSLHLFFF 
KSQCV.VILHPSTSSSS 






SalI







SalI



TTCTTCTTCTTCTTCTTCTAACCTCGCCCCGTTTGTGTTTGATGAGTCGA 

I I I I I I I I I I I I I I I I I I I I I ' ' I I I ' I I I I I I ' I I I ' I I I I t I I ' I I I I 
AAGAAGAAGAAGAAGAAGATTGGAGCGGGGCAAACACAAACTACTCAGCT 

LLLLLLLTSPRLCLMSR 
FFFFFF.PRPVCV . VD 
SSSSSSNLAPFVFDES 



SEQ B



actcttcccacatcgctcgtcaaaactcaIgagctttattagggaactcag 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I T 'I I I I I I I I I ' I I I I I ' I 

tgagaagggtgtagcgagcagttttgagtctcgaaataatcccttgagtc 
lfphrssklrallgn i s 
sshiarqnsely.gts 
tlptslvktqsfirehq 

cIaatactatatgtatatgtanaaggtcaacgttggctgaagaagttggtt 

~r ' I I I I ' I ' I ' I I I I I I I ' M I I — I I I — I I I I I I I I I I I I I I I I I I I I I I I I 

gttatgatatacatatacatnttccagttgcaaccgacttcttgaaccaa 

nticic7rstlaeelg 
a i lyvyv7gqrwlknlv 
qyymym7kvnvg.rtwf 

ttgcctttgcaggaagaaaggaaacagctacggtatcgatattgttgaga 

-f— I — I I I 1 I I I I I I I I I I I I I I I I I I I I — i — I I I I I I I I I I I I I I I I I I I I — I— H 
AACGGAAACGTCCTTCTTTCCTTTGTCGATGCCATAGCTATAACAACTCT 

FAFAGRKETATVSILLR 
LPLQEERKQLRYRYC D 
CLCRKKGNSYGIDIVE 

CCGAGAAGAGGTACTGATTAGCTTCTTCTCCCTCCTCCTCGTCGAGGATG 

I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I 1 I I I I I I I I I I I I I r I I I I I I I 
GGCTCTTCTCCATGACTAATCGAAGAAGAGGGAGGAGGAGCAGCTCCTAC 

PRRGTD.LLLPPPRRG. 
REEVLISFFSLLLVED 
TEKRY LASSPSSSSRM 

ATCAAACTAATTAGGATTACACCTTATTACCTTACCTAATGCTTTTTCCG 

I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I 1 I I I r I I I I I I I I I 

TAGTTTGATTAATCCTAATGTGGAATAATGGAATGGATTACGAAAAAGGC 

SN. LGLHL I TLPNAFS 
DQTN.DYTLLPYLMLFP 
IKLIRITPYYLT.CFFR 



FIG. 18C-1 



Sal I

tattcgtttcgtctcttcagctacg'tcgacgaggtgatcgttgccgcaga 

I I I I I I ' I I I I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
ATAAGCAAAGCAGAGAAGTCGATGCAGCTGCTCCACTAGCAACGGCGtct 

VFVSSLQLRRRGDRCRR 
YSFRLFSYVDEV I VAAE 
IRFVSSATSTR.SLPQ 



AGCTGCCGAGCATGACGGCAAGTGCAAGTGCGGCGCCGCCTGCGCCTGCA 

I ' I ' I I I ' I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I I I 

TCGACGGCTCGTACTGCCGTTCACGTTCACGCCGCGGCGGACGCGGACGt 
SCRA RQVQVRRRLRLH 
AAEHDGKCKCGAACAC 
KLPSMTASASAAPPAPA 



CCGACTGCAAGTGTGGCAACTGAGAAGCACTTGTGTCACTACCACTAAAA

I I I ' M ' I I I ' I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 



GGCTGACGTTCACACCGTTGACTCTTCGTGAACACAGTGATGGTGATTTT 

RLQVWQLRSTCVTTTK 
TDCKCGN.EALVSLPLN 
PTASVATEKHLCHYH. I 

AAAAGTTTG CAATGCATAAAAAACAAAAGAACAAAAAAAAAAAAGGAAGA 

I ' I I I I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

TTTTCAAACGTTACGTATTTTTTGTTTTCtTGtTtTttttttttCCttct 
KFAMHKKQKNKKKKGR 
KSLQCIKNKRTKKKKEE 
KVCNA KTKEQKKKRK 

AGAAGAAGGTGTGGCTATGTACTCTAATAATTCGGGCAGGCTGATAAGTT 
I I I I I I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
TCTTCTTCCACACCGATACATGAGATTATTAAGCCCGTCCGACTATTCAA 

RRRCGYVL . FGQADRL 

EEGVAMYSNNSGRL IG 

KKKVWLCTLI IRAG.. V 

GTAAGATGGGATAACGCAGTATCATCTGTGTTATCTCTGTCCTGTGTTAC 

I I I I I I I ' I I ' I ' I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

CATTCTACCCTATTGCGTCATAGTAGACACAATAGAGACAGGACACAATG 

DG I TQYHLCYLCPVL 
CKMG.RSIICVISVLCY 
VRWDNAVSSVLSLSCVT 



FIG. 18C-2 






AACTCTCCTATCTATCCTAGTCAATGAAATATTATTAGTATTAATCTGGT 
I I I I I I I ' I I ' I I I I I I I I I I 'I I i ' I ' I I ' I I I i I ' I I I I I ' I I I I I I I 
TTGAGAGGATAGATAGGATCAGTTACTTTATAATAATCATAATTAGACCA 

QLSYLS.SMKYY.Y.SG 
NSPIYPSQ.NI ISINLV 
ILLS I LVNE I LLVLLW 



TGTGTCATTCATATATGCTGCTGCTGCTGCTGCTTCCTCTTTCACCAATC 
I I I ' I I I ' I I ' I I ' I I ' I ' I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I 
ACACAGTAAGTATATACGACGACGACGACGACGAAGGAGAAAGTGGTTAG 

CVIHICCCCCCFLFHQS 
VSF i YAAAAAASSFTN 
LCHSYMLLLLLLPLSPI 

AACCCAAAGGATCGATTGCACTGTAAGGCCCAACTTCCTCACCGATATGC 
I ' I I I I I I I I ' I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
TTGGGTTTCCTAGCTAACGTGACATTCCGGGTTGAAGGAGTGGCTATACG 

TQR I DCTVRPNFLTDM 
QPKGSIAL.GPTSSPIC 
NPKDRLHCKAQLPHRYA 



SEQ D



TCGCTCAGTTACGATGAATGAACAGCAACCAAACGAGTCTGC 

I I I I I I I I I I I I I I I I I I I I I i f-L I I I I I I I I I I I I I I I I I I -L - 2392 

agcgagtcaatgctacttactt|gtcgttggtttgctcagacg| 
laqlr mnsnqtsl 
slsyde.tatkrvc 
rsvtmneqqpnesa 



FIG. 18C-3 






Apa I

XhoI

SalI
Acc I
Hinc II

Cla I

Hind III



TCACTGGTACGGGGCCCCCCTCGAGGTCGACGGTATCGATAAGCTTTGAT 



I I I I I I I 



I I I I I I ' I 



I I I I I I I i I I I I I I I I t I 1 I I I 1 1 I I I 



AGTGACCATGCCCCGGGGGGAGCCCCAGCTGCCATAGCTATTCGAAACTA 
SLVRGPPRGRRYR.ALI 
HWYGAPLEVDG I DKL 
XTGTGPPSRSTVS I SFD 

CTCTTCTCTCAATCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTGTATG 



I I I I I I ' I I I I I I I ' I I I I I I I I I I I I I I I I I ' 1 I I I I I I I I I 



I I I 



GAGAAGAGAGTTAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGACATAC 

SSLNLSLSLSLSLSLY 
SLLSISLSLSLSLSLCM 
LFSQSLSLSLSLSLSVC 

CTTTAAATATGGTTGTAATGCTGAATTGCTATGTTTATCTTGGCCCAAAG 



I I I I 1 I 



I I I I I I I I I I I I I I 



I I I I I I I I I I I I I I I I I I I I I I I 



GAAATTTATACCAACATTACGACTTAACGATACAAATAGAACCGGGTTTG 
XFKYGCNAELLCLSWPN 
SLNMVVMLNCYVYLGQT 
L. IWL.C. lAMFILAK 

TGTGTCCATCTTTGAGCAGATAAATCTGGCGATAATGTTCTTTTTACTGA 

I I I I I I I I I I I I I I I I I I I I I I I I I I I t I I I I I I I I I I I I I I I I I I I I I I 

ACACAGGTAGAAACTCGTCTATTTAGACCGCTATTACAAGAAAAATGACT 

CVHL .ADKSGDNVLFTE 
VSI FEQINLAIMFFLL 
LCPSLSR. IWR.CSFY. 

Pst I

AAGCAGTGCAGGATGAGGGCCTGAAATGACATCGGACGCCCACTGGGTCA 



I I I I I I I I ■■ t I I I I I I I I I I I I I I 



I I I I I I 



-I— I- 



■+-I- 



I I t I I 



TTCGTGACGTCCT ACTCCCGGACTTTAGTGTAGCCTGCGGGTGACCCAGT 

STAG.GPEITSDAHWV 
KALQDEGLKSHRTPTGS 
KHCRMRA NH IGRPLGH 

TGATGATATGGACTCCTCCACAGCGAGCAGCCATGGGATGTGAGATCCAC 
I I I I I I I I I I I I I I I I I I I I I I I I I I I t- I I I I I I I I I I I I I 1 I I I I I I I I 
ACTACTATACCTGAGGAGGTGTCGCTCGTC6GTACCCT ACACTCTAGGTG 

MMIWTPPQR AAMGCEIH 
YGLLHSEQPWDVRST 
DDMDSSTASSHGM DP 



FIG. 19A-1 






ATAGCAGCGTAGATAAGGGAAGCCCGCAACACTAGGCTGTTGTTGTTCCA 
I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I 
TATCGTCGCATCTATTCCCTTCGGGCGTTGTGATCCGACAACAACAAGGT 

XAA. IREARNTRLLLFQ 
QRR.GKPATLGCCCS 
HSSVDKGSPQH AVVVP 

GTAAAGATCGAAAGGTCAGGCGACAGTGACGATCGACTTTTTCGAGCATG 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ' I I I I I ' I I ' I ' I 
CATTTCTAGCTTTCCAGTCCGCTGTCACTGCTAGCTGAAAAAGCTCGTAC 

RSKGQATVT I DFFEH 
SKDRKVRRQ RSTFSSM 
VKIERSGDSD DRLFRA. 

ATGACAACGACGACCTGCTCCTGCAATATCCGTCCCCTACCGTAGAGTGG 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I 
TACTGTTGCTGCTGGACGAGGACGTTATAGGCAGGGGATGGCATCTCACC 

DDNDDLLLQYPSPTVEW 
MTTTTCSCNIRPLP.SG 
QRRPAPA I SVPYRRV 

GAATAAATGGGTTTGTAGTTGCACTATTTCTCGCAGGAATTAATTGAAAG 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ' I ' I 
CTAATTTACCCAAACATCAACGTGATAAAGAGCGTCCTTAATTAACTTTC 

E.MGL .LHYFSQELIES 
NKWVCSCTISRRN.LK 
GINGFVVALFLAGIN.K 



FIG. 19A-2 






CCCTGCA AATTGCTGTTTCTCTTTCCTTATATTAAACCTTCCTCCTGTTA 

I I I M I ' I ' i I ' I I I I I I M I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I 
GGGACGTTTAACGACAAAGAGAAAGGAATATAATTTGGAAGGAGGACAAt 

PANCCFSFL I LNLPPV 
ALQiAVSLSLY.TFLLL 
PCKLLFLFPY I KPSSCY 

BamH I

cattaaaattgcatgttaagacatttctgtatg'gatccgaacatgagatc 
gtaattttaacgtacaattctgtaaagacatacctaggcttgtactctag 

TLKLHVKTFLYGSEHEI 
H .NCMLRHFCMDPNMRS 
IKIAC.DISVWIRT.D 

tatcattgaagtaatgggtaggatttacattatcatcatcatcatcatct 

J i |i| i' i i |iiii|'i i ' |i i ii |i iii | i i ii |' ii i |iiii| ii i i| 

atagtaacttcattacccatcctaaatgtaatagtagtagtagtagtaga 

yh.sng.dlhyhhhhhl 
iievmgriyiiiiiii 

lslk.wvgftlssssss 

BstX I

ccatgggt'ttggatctaattagaccgaaaacctcatttaaaatccaaccc 

M ' I I I I I ' I I I I I I I I I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
GGTACCCAAACCTAGATTAATCTGGCTTTTGGAGTAAAttttAG^TTGGG 

HGFGSN.TENLI.NPT 
SMGLDLIRPKTSFKIQ P 
PWVWI LDRKPHLKSNP 



FIG. 19A-3 






XXATATTGGCTTGACTTGCTCCATCTCCAAGAAAAATACAACAAGAACAA 
I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
XXTATAACCGAACTGAACGAGGTAGAGGTTCTTTTTATGTTGTTCTTGTT 

XILA.LAPSPRKIQQEQ 
XYWLDLLHLQEKYNKNN 
NIGLTGSISKKNTTRT 



CAAAAATTTAGGATGCACATTGAATTGATTTGGTCACTATGAGAGAATCA 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ' I I I ' I ' I 
GTTTTTAAATCCTACGTGTAACTTAACTAAACCAGTGATACTCTCTTAGT 

QKFRMHIELIWSL.ENH 
KNLGCTLN.FGHYERI
TKI.DAH.IDLVTMRES

TGGATTAAAAATATTAAAATAAAAAATAAATCATAATCATCTACTCACTC
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
ACCTAATTTTTATAATTTTATTTTTTATTTAGTATTAGTAGATGAGTGAG 

GLKILK.KINHNHLLT 
D.KY.NKK.IIIIYSL
WIKNIKIKNKS.SSTHS 

TAACGATTCACATTCTATCCACCAAATTTGACATCGGCTTCTAATTAATT 
I I I I I I I I I I I I I I I ' I ' I I I I ' I I I I I I I I I I I I I I ' I I I I ' I I I ' I ' I 
ATTGCTAAGTGTAAGATAGGTGGTTTAAACTGTAGCCGAAGATTAATTAA 

LTIHILSTKFDIGF.LI
RFTFYPPNLTSASN.F 
NDSHSIHQI.HRLLIN

TCATATATTAGGTTCTAAAAAATCTCTCCCTTTGACAGATGAATAAATAT 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ' I ' I I I 'I I I j. I ; I I 
AGTATATAATCCAAGATTTTTTAGAGAGGGAAACTGTCTACTTATTTATA

SYIRF.KISPFDR.INI
HILGSKKSLPLTDE.I
FIY.VLKNLSL.QMNKY

TTCTTTTAATTCGTTAGGGAAGGATCTAATATAATATATATATATATATA 

| i ii|iiii| i ii i| iiii |i i ii |iii'l'i i il ' iii M''.!.| i ji! l 
AAGAAAATTAAGCAATCCCTTCCTAGATTATATTATATATATATATATAT 

SFNSLGKDLI.YIYIY 
FLLIR.GRI.YNIYIYI
FF.FVREGSNIIYIYIY



FIG. 19B-1 






TATTTATTTATTAGATTCTAACCATTTCTCTCACCAGAATATGAATCGAC

M ' M I I I ' I I ' I ' M 'I I I ' I I I i I I I I I I I I I I I I I I I I I i I I I I I I I 
ATAAATAAATAATCTAAGATTGGTAAAGAGAGTGGTCTTATACTTAGCTG

IFIY.ILTISLTRI.ID
YLFIRF.PFLSPEYEST
IYLLDSNHFSHQNMNR



MTZ SEQ A 



GGCCATATCTGCAAAAACCCACCAATTGTTCACAGTAAACGCTCATTGAA
CCGGTATAGACGTTTTTGGGTGGTTAACAAGTGTCATTTGCGAGTAACTT

GHICKNPPIVHSKRSLN 
AISAKTHQLFTVNAH 
RPYLQKPTNCSQ.TLIE 

Xba I

TTAAGGTCGAAATTACTTTTAAATTT'CTAGAGATTTCCAATAAAATATAC

I I I I I I I I I I I I I I I I I I I I I I I ' I I I I I I I I I I I I I I I I I I I I I I 

AATTCCAGCTTTAATGAAAATTTAAAGATCTCTAAAGGTTATTTTATATG

GRNYF.ISRDFQ.NI
IKVEITFKFLEISNKIY 
LRSKLLLNF.RFPIKYT 

TCGTATCTTTTACAGTGATGATGCTCCGGATGATAAGATGGAAGGATGCG
AGCATAGAAAATGTCACTACTACGAGGCCTACTATTCTACCTTCCTACGC

LVSFTVMMLRMIRWKDA 
SYLLQ..CSG..DGRMR
RIFYSDDAPDDKMEGC

TGTGTCAGCCGCCTGCGATCTCTGTGGCGGGGACGAGACGAAGACAAGGA

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

ACACAGTCGGCGGACGCTAGAGACACCGCCCCTGCTCTGCTTCTGTTCCT
CVSRLRSLWRGRDEDKD 
VSAACDLCGGDETKTR 
VCQPPAISVAGTRRRQG

CGTGAGCGGACGATACCAAGTCTTCTCCTCCCCCACCACGCACGTCTCAG

T I I ' M ' I ' I I I ' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

GCACTCGCCTGCTATGGTTCAGAAGAGGAGGGGGTGGTGCGTGCAGAGTC

VSGRYQVFSSPTTHVS 
T.ADDTKSSPPPPRTSQ
RERTIPSLLLPHHARLR 



FIG. 19B-2 






ATTCCCGATACGGCCTATCCCGGTGGCGTGTGGACTGCACAGACGAACGA 
I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I 
TAAGGGCTATGCCGGATAGGGCCACCGCACACCTGACGTGTCTGCTTGCT 

DSRYGLSRWRVDCTDER 
IPDTAYPGGVWTAQTNE
FPIRPIPVACGLHRRT 

GTAAATGCCCATCCCCCCTCTTTCATTCTTTCTCTTTGCGTGTGTGAGAG 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ' I I ' I I I' I I I I I ' I I I ' I I 
CATTTACGGGTAGGGGGGAGAAAGTAAGAAAGAGAAACGAACACACTCTC 

VNAHPPSFILSLCVCER
MPIPPLSFFLFACVR
SKCPSPLFHSFSLRV.E 

GAGCGCCTATAAATAAGCACGAAACAAGCCCCTTTTCTCTCCAAGAACAC 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ' I I' I 
CTCGCGGATATTTATTCGTGCTTTGTTCGGGGAAAAGAGAGGTTCTTGTG 

SAYK.ARNKPLFSPRT 
GAPINKHETSPFSLQEH
ERL.ISTKQAPFLSKNT

ACCACACCATTCACACACTACATCCTCTGCTTCTTCGAGCCTTTTCGCCT 
I I I I I I I I I I ' I I ' I I ' I I I I I I I I I I I ' I ' I ' I I I I ' I I I I I ' I ' I ' I I 
TGGTGTGGTAAGTGTGTGATGTAGGAGACGAAGAAGCTCGGAAAAGCGGA

HHTIHTLHPLLLRAFSP
TTPFTHYILCFFEPFRL 
PHHSHTTSSASSSLFA 





iSal I 
Acc I 

; Hind II Hind I! '< 

^9?TT?9T9?"!"^"^^^^^^"^?"^^Qac ctgcggcaactgcgactgcgtt'gac 

XGGAAGGAGeAGATtGGTAtAGCTGGACGCCGTTGACGCTGAC^ 

SFLV.PCRPAATATALT 
PSSSNHVDLRQLRLR 
XLPRLTMSTCGNCDCVD 

j INTRON 

aagagccagtgcgtg'taagtcatcc tccatccctccacctcttccccttc 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I 
TTCTCGGTCACGCACATTCAGTAGGAGGTAGGGAGGTGGAGAAGGGGAAG

Rasacksssippp'lll 
qepvrvshppslhlfff 
ksqcv.vilhpssssss 



Hind II 
Acc I
Sal I

TTCTTCTTCTTCTTCTTCTAACCTCGCCCCGTTTGTGTTTGATGAGTCGA

I ■ I ' I I I I ' I I I I I [ I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

AAGAAGAAGAAGAAGAAGATTGGAGCGGGGCAAACACAAACTACTCAGCT

LLLLLLLTSPRLCLMSR
FFFFFF.PRPVCV..VD
SSSSSSNLAPFVFDES



MTZ SEQ B

CTCTTCCCACATCGCTCGTCAAAACTCAGAGCTTTATTAGGGAACATCAG

I I I ' I ' I I' I I I ' I I I I I I I I I I I I I 1 1 I I I I I I I I I I I I I I I I I I I I I I 

GAGAAGGGTGTAGCGAGCAGTTTTGAGTCTCGAAATAATCCCTTGTAGTC

LFPHRSSKLRALLGNIS
SSHIARQNSELY.GTS
TLPTSLVKTQSFIREHQ



FIG. 19C-1 






Hinc II

1. Claims: 1-3,6-13,15-33 partially

An isolated and purified banana DNA molecule being 
differentially expressed during banana fruit development, 
especially being a starch synthase, corresponding to 
pBAN 3-33 or pBAN 3-18, and the corresponding protein.
Chimeric genes, vectors, compositions, plant cells and 
plants comprising said DNA molecules or proteins. 
A regulatory element of banana which is 5' or 3' to a gene
differentially expressed during banana fruit development, 
activated by ethylene, chimeric genes, plant cells, and 
plants comprising said element. 

A method for the expression of a heterologous protein in 
fruit employing said chimeric genes, preferably for the 
expression and purification of a therapeutic protein, fruit 
and protein produced thereby. 



2. Claims: 1-3,5-33 partially; 4 completely 

idem, the DNA sequence encoding a chitinase or 
endochitinase, preferably corresponding to pBAN 3-30, 
pBAN 3-24, SEQ ID NO:1-3, Fig. 16, 17; the protein
corresponding to SEQ ID NO: 4-6 



3. Claims: 1-3,6-13,15-33 partially 

idem, the DNA sequence encoding a beta-1,3 glucanase, 
preferably corresponding to pBAN 1-3 



4. Claims: 1-3,6-13,15-33 partially 

idem, the DNA sequence encoding a thaumatin-like protein, 
preferably corresponding to pBAN 3-28 



5. Claims: 1-3,6-13,15-33 partially 

idem, the DNA sequence encoding an ascorbate peroxidase, 
preferably corresponding to pBAN 3-25 



6. Claims: 1-3,5-33 partially 

idem, the DNA sequence encoding a metallothionein,
preferably corresponding to pBAN 3-6, pBAN 3-23, Fig. 18, 19



7. Claims: 1-3,6-13,15-33 partially 

idem, the DNA sequence encoding a lectin, preferably 
corresponding to pBAN 3-32 



8. Claims: 1-3,6-13,15-33 partially

idem, the DNA sequence encoding a senescence-related 
protein, preferably corresponding to pBAN 3-46 